Editorial Note


This special issue introduces recent progress in Fujitsu’s semiconductor technology.
Because competition is severe, advances in semiconductor technology are made every
day. This is also true at Fujitsu, where new products are continuously created and marketed.
In this sense, this special issue can only present a snapshot of the technology.

Nevertheless, semiconductor technology consists of many different fields of engineering
and progresses through the consistent integration of these fields. An issue such as this must
therefore cover a wide range of these fields to introduce the technology properly. This
issue tries to cover as many topics as possible, ranging from the latest device and manufacturing
technologies to simulation technology. A total of 21 papers are thus included in this volume to represent
most of these fields, comprising both review papers on individual Fujitsu technologies and original
reports.

These papers are divided into 6 groups to give a clearer view of the whole contents. Three of
these groups deal with device technology. The first is High Speed Devices, a field in which Fujitsu
has a better reputation than any other company. Here the technologies representing three
different fields of high-speed devices, namely Si, compound semiconductor, and Josephson
junction (JJ) devices, are reviewed. The JJ device is included because it falls into the same category of
high-speed integrated circuits. Representing high-speed compound semiconductor devices,
the HEMT is included because it has drawn great attention as Fujitsu’s original technology.

Memories, ASICs, microprocessors, and other advanced devices are grouped
into Large Scale Devices. These represent Si technology, the mainstream of present semiconductor
technology. A total of 5 papers are listed here concerning the most advanced VLSI
technologies developed at Fujitsu, including the Mbit DRAM, the 32-bit microprocessor, and
the 100K-gate gate array, each of which is among the largest-scale devices of its kind in the world.

The third device technology group is Compound Semiconductors. Review papers on
the two most popular device applications, in which Fujitsu has an excellent engineering record, are given
there together with a work on material characterization.

Papers related to production techniques are divided into Process Technology and
Manufacturing Technology for convenience. Those directly concerned with the wafer process
are placed in the former, which contains 4 topics on advanced bipolar processes and others
together with a review paper on sub-micron lithography technology. The latter contains papers
related to the manufacturing and mass production of products: two review papers on mask
and packaging technologies and one paper on reliability.

Finally, two papers are in the Simulation Technology group: a report on a
circuit simulation system and a work on an ASIC design tool. With these included, this issue is
considered to cover all the supporting fields of semiconductor technology.

This concludes the outline of the aim and contents of this issue. We sincerely hope that this
issue will be useful to all users of Fujitsu’s semiconductor products. We also hope that this issue
will benefit other engineers and thus contribute to the further advancement of semiconductor
technology.


Dr. Takahiko Misugi, Associated Editor — Dr. Ryoiku Togei, Editorial Representative, 
Special Editor of this Issue 
High-Speed Bipolar Logic IC


● Ken-ichi Ohno ● Hirofumi Takeda


Continuing advances in silicon technology have produced IC 
devices with increasingly high speed, high density, and low power 
dissipation. 

This paper reviews the trends at Fujitsu in high-speed bipolar 
logic IC devices. 

Methods suggested for improving the speed of ECL LSI devices
are also outlined, and a 100-ps 10K-gate ECL gate array and
a 2.7 GHz prescaler are given as examples of advanced bipolar
logic IC devices.


UDC 621.3.049.771.14:621.382.3 
FUJITSU Sci. Tech. J., 24, 4, pp. 271-283(1988) 


Ultra-High-Speed HEMT LSI Technology 


● Masayuki Abe ● Takashi Mimura ● Masaaki Kobayashi


High Electron Mobility Transistors (HEMTs) are very promising 
devices for ultra-high-speed LSI/VLSI because of the supermobili- 
ty GaAs/AlGaAs heterojunction structure. 

This paper discusses the current status and recent advances of 
HEMT technology for high-performance VLSI with a focus on 
materials, self-alignment device fabrication, and HEMT LSI im- 
plementations. 

HEMTs have already been used to develop a high speed 16 kbit 
static RAM and a 4.1 k-gate gate array. 


UDC 537.312.62 
FUJITSU Sci. Tech. J., 24, 4, pp. 284-292 (1988) 


Recent Advances in Josephson Junction Devices 


● Shinya Hasuo ● Takeshi Imamura ● Norio Fujimaki


This paper describes recent advances in high-speed digital circuits
using all-niobium (Nb/AlOx/Nb) Josephson junctions. The world’s
fastest logic gate, Modified Variable Threshold Logic (MVTL), is
described. The MVTL gate family has been applied to various logic
circuits such as a 16-bit ALU (Arithmetic Logic Unit) and a 4-bit
microprocessor. The high-speed performance of Josephson junctions
in LSI-level circuits has been verified using these circuits.

A new type of high-sensitivity magnetic sensor, a SQUID (Superconducting
QUantum Interference Device), has also been invented.
It is called a “single-chip SQUID” because all the circuits necessary
for its operation have been integrated into a single chip.


UDC 621.377:681.327.67 
FUJITSU Sci. Tech. J., 24, 4, pp. 293-300 (1988)


Self-Timed RAM: STRAM 


● Chikai Ohno


A STRAM is different from conventional RAMs because it has 
synchronous operation and an on-chip write pulse generator. 
Three types of STRAMs are presented in this paper. Each type is 
a standard device and has unique features which are useful in 
various applications. 

A system model using STRAM was evaluated, and it was shown
that STRAM can improve the system-level cycle speed to twice
that of a conventional RAM. Using already established process
technology, Fujitsu has developed a 1K x 4 standard STRAM having
a cycle time of 9 ns and a 4K x 4 STRAM having a 13 ns cycle
time.


UDC 621.377.621:681.327.67 
FUJITSU Sci. Tech. J., 24, 4, pp. 301-317 (1988) 


3D Stacked Capacitor Cell for Mega Bit DRAM 


● Tomio Nakano ● Takashi Yabu


This paper discusses the three-dimensional stacked capacitor (3D 
STC) cell technology that Fujitsu used in 1-Mbit DRAMs (Fujitsu 
was the first to do this), and the development of 1-Mbit and 
4-Mbit DRAMs using the 3D STC technology. 3D STC technology 
is the key to cell area reduction enabling densities higher than 
1-Mbit. This technology provides mass production capability and 
a high immunity to alpha-particle-induced soft errors. To respond 
to market demands for low power consumption, high speed, and 
high reliability, 1-Mbit DRAMs were designed using CMOS tech- 
nology. A 4-Mbit DRAM having an access time of 56 ns and low 
power consumption of 175 mW was also developed. 


UDC 621.3.049.774
FUJITSU Sci. Tech. J., 24, 4, pp. 318-327 (1988) 


Development of Sea of Gates 


● Yoshiyuki Suehiro ● Nobutake Matsumura ● Gensuke Goto


Sea of gates was introduced as an LSI that can integrate system-level
circuits including memory functions. With this new type of
LSI, highly integrated, high-performance CMOS LSIs of 30K to
160K gates have been successfully developed. The LSIs are fabricated
with a 1.0 μm or 1.2 μm CMOS triple-metal-layer process
technology. An original basic cell structure makes it easy to construct
both memories and random logic circuits on the LSI chip.
Furthermore, the unique structure of an I/O buffer cell and
improvements in assembly technology have made high pin counts possible.
Cavity-down packages of up to 401 pins were developed.


UDC 621.3.049.774:681.323 
FUJITSU Sci. Tech. J., 24, 4, pp. 328-334 (1988) 


Development of Microcontroller: F²MC


● Jyoji Murakami


The Fujitsu Flexible Microcontroller (F²MC) has been developed
to meet the market’s need for a high-performance application-
specific controller. This microcontroller features high speed
(0.33 μs cycle time), efficient object code, and flexible design, and
can be applied to many areas.

The CPU architecture, design philosophy, and technology fea- 
tures of the CPU are described and various application products 
are shown. 


UDC 621.3.049.774:681.323 
FUJITSU Sci. Tech. J., 24, 4, pp. 335-344 (1988)


Development of 32-Bit Microprocessor Family Products: 
Gmicro F32 


● Shosuke Mori ● Koichi Fujita ● Haruyasu Itoh


The technological trend toward higher integration and expanded 
functions of microprocessors has become increasingly important 
in the design of workstations and embedded controllers. 

This paper outlines the Gmicro F32 and key technologies for 
its development. 
UDC 621.383.5:621.391.6
FUJITSU Sci. Tech. J., 24, 4, pp. 345-358 (1988) 


Lightwave Semiconductor Devices 


● Haruo Yonetani ● Akira Fukushima ● Keiji Satoh


Lightwave semiconductor devices are one of the keys to building
lightwave communication systems. In this paper, the Fujitsu lightwave
semiconductor devices now being produced are reviewed according
to individual device characteristics and system applications.


UDC 621.382.3:621.396.6.029.6 
FUJITSU Sci. Tech. J., 24, 4, pp. 359-371(1988) 


Microwave Semiconductor Devices 


● Kiyofumi Ohta ● Kenji Yano ● Yutaka Hirano


This report describes state-of-the-art Fujitsu microwave semiconductors.
The important parameters of GaAs power FETs are
efficiency, linearity, and reliability. The design philosophy for
these parameters and the resulting performance are discussed. The low-noise
performance of HEMTs has been demonstrated, and a new 1/4 μm
gate HEMT has been developed having a 0.58 dB noise figure and
12.35 dB of associated gain at 12 GHz. MMICs have high potential
for wide-band, small, and lightweight equipment. GaAs FET
modules and amplifiers are also described as examples of actual
applications.


UDC 621.315.5:620.187 
FUJITSU Sci. Tech. J., 24, 4, pp. 372-378 (1988) 


Characterization of Compound Semiconductor Materials 
by Transmission and Reflection Electron Microscopy 


● Itsuo Umebu


The interface structures of the GaAs/AlAs superlattice were 
analyzed by Transmission Electron Microscopy (TEM) with the 
help of computer simulation. The bright spot arrays which appear 
on the uppermost AlAs layer are shown to be a good indicator for 
this analysis. Atomic layer steps occur at intervals of 3-10 nm. The 
surfaces of MBE-grown GaAs layers were analyzed in detail with 
Reflection Electron Microscopy (REM). The surfaces consist of 
undulations and small steps. Anisotropic surface roughness may be 
due to anisotropic Ga surface diffusion. Atomic ordering in InGaP 
mixed crystals was analyzed by cross-section TEM, and a crystal 
model with double periodicity is proposed. 


UDC 621.3.049.774.3:621.7.04 
FUJITSU Sci. Tech. J., 24, 4, pp. 379-383(1988) 


Ultra High-Speed Bipolar Process Technology: ESPER 


● Tatsuya Deguchi ● Hiroshi Goto


This paper describes an ultra high-speed bipolar process technology
using an Emitter-base Self-aligned structure with Polysilicon
Electrodes and Resistors (ESPER). This structure, combined with
trench isolation, drastically reduces parasitic capacitances and
resistances, realizing a sub-40 ps ECL circuit and high-performance
bipolar devices.


UDC 621.3.049.774 
FUJITSU Sci. Tech. J., 24, 4, pp. 384-390(1988) 


High-Speed BiCMOS Technology with Polysilicon Emitter 
Structure 


● Hiroyuki Fukuma ● Tsunenori Yamauchi ● Yoshinori Okajima


This paper describes a high-speed BiCMOS technology which
consists of bipolar process technology using a polysilicon emitter
and CMOS process technology using the 1.0 μm rule.

The following high-speed characteristics of the BiCMOS were obtained:
the cutoff frequency (fT) of the bipolar npn transistor was found
to be 6 GHz, and the propagation delay time (tpd) of the CMOS
gate was 0.5 ns. The high performance of the conventional bipolar
device and CMOS device was also maintained.

The BiCMOS technology has been applied to fabricate a 2 000- 
gate gate array and a 256-Kbit SRAM. The results of these devices 
are also reported. 


UDC 621.382.33:621.7.04 
FUJITSU Sci. Tech. J., 24, 4, pp. 391-397 (1988) 


Characteristics of Si HBT with Hydrogenated Micro- 
Crystalline Si Emitter 


● Hiroshi Fujioka ● Kanetake Takasaki


An npn Si HBT has been fabricated using hydrogenated microcrystalline
Si as a wide-gap emitter. It shows much higher common-
emitter current gain than a conventional homojunction transistor.
The measured common-emitter current gains of the fabricated
HBTs having intrinsic base sheet resistances of 14 kΩ/□ and
95 Ω/□ are 1500 and 18, respectively. The present HBT can
operate normally even at liquid nitrogen temperature.


UDC 621.382.32:621.7.04 
FUJITSU Sci. Tech. J., 24, 4, pp. 398-407 (1988) 


Application of EB-Lithography for Fabrication of 
Submicron-Gate-MOSFETs 


● Shuzo Ohshio ● Tetsuo Izawa


The characteristics and the proximity effect of resist in electron-
beam direct writing were studied in order to form submicron patterns.
MOSFETs were obtained with an effective channel length down to
about Leff = 0.32 μm. The transistors fabricated using this technique
operate well without punch-through. By evaluating the
dispersion of many transistors, it was found that there is a strong
possibility that devices with a minimum pattern size of 0.2-0.3 μm
can be manufactured for practical use.

The influence of the electron-beam direct-writing on the relia- 
bility of devices was studied and it was confirmed that this 
method is sufficiently reliable when used for gate-electrode forma- 
tion. 


UDC 621.382.2:621.7.04 
FUJITSU Sci. Tech. J., 24, 4, pp. 408-417 (1988) 


SOI-Device on Bonded Wafer 


● Hiroshi Gotou ● Yoshihiro Arimoto
● Masashi Ozeki ● Kazunori Imaoka


The bonded wafer technique for fabricating SOI (Silicon On Insulator)
devices has been extensively studied. This technique has been
successfully applied to the fabrication of a 64-Kbit SOI-DRAM, which
exhibits a soft error rate as low as 1/7 that of conventional
DRAMs, depending on the substrate thickness of the bonded
wafer. It was found that the soft error rate depended on the base substrate
bias voltage.

It is also shown that the bonded wafer technique can solve the
latchup problem in CMOS, and is advantageous when used in the
fabrication of SOI bipolar devices. Two new types of bipolar transistor
are proposed.
UDC 621.3.049.774:621.7.04
FUJITSU Sci. Tech. J., 24, 4, pp. 418-431 (1988) 


Overview of Mask Technology 


● Kimio Yanagida ● Takao Furukawa ● Takeo Kikuchi


Mask-making technology plays an important role in generating 
the patterns for devices used in semiconductor fabrication. Fujitsu 
recognizes the importance of this technology and has developed it 
from the early days of semiconductor fabrication. 

This paper describes the mask-making technologies currently 
being used: data processing, exposure, process and inspection 
technology. This is followed by a discussion of the future trends in 
mask-making technology. 


UDC 621.3.049.76 
FUJITSU Sci. Tech. J., 24, 4, pp. 432-445 (1988)


Packaging Technology for ASICs 


● Michio Sono


The use of ASICs is rapidly progressing and expanding. ASIC 
packaging technologies are also enabling higher density, diversifica- 
tion, and customization. This review introduces the package line- 
up and ASIC packages currently offered by Fujitsu, then describes 
current technological problems and future trends. 


UDC 621.3.049.774 
FUJITSU Sci. Tech. J., 24, 4, pp. 446-455 (1988) 


Reliability on Short-Channel MOSLSIs 


● Ken Shono ● Kenji Ishida ● Nagao Yamada


This paper describes the reliability problems associated with the
scaling down of MOSLSIs. Because the power supply voltage is not
scaled, the internal electric field increases, causing reliability problems
such as degradation of MOSFETs due to hot carrier generation
and electric breakdown of the gate oxide. Furthermore, high-
speed operation and narrow metallization widths increase the
current density, which reduces the interconnection
reliability.

These problems can be overcome by improving the device struc- 
ture and fabrication technology. 


UDC 621.3.049.774.3
FUJITSU Sci. Tech. J., 24, 4, pp. 456-463 (1988)


Bipolar Circuit Simulation System Using Two- 
Dimensional Device Simulator 


● Shigeo Satoh ● Hideki Oka ● Noriaki Nakayama


To accurately estimate bipolar circuit performance, two circuit 
simulation systems using a two-dimensional device simulator have 
been developed and compared. One uses the table method and the 
other uses the direct method. 

The direct method was found to be superior to the table method.
Using the direct method, the propagation delay time of an
ECL gate has been estimated and the influence of extrinsic elements
and collector impurity concentration on delay time was
investigated.


UDC 621.3.049.76:681.323 
FUJITSU Sci. Tech. J., 24, 4, pp. 464-468 (1988) 


Speed Tunable Finite State Machine Compiler: 
ZEPHCAD™ 


● Hitomi Sato ● Yoshihide Sugiura ● Masahiro Fujita


This report describes a personal computer based state machine 
compiler. This system transforms the finite state machine descrip- 
tion, Boolean equations, or truth table into net lists of the CMOS 
gate array or standard cell. 

A new feature of this system is the capability to tune the circuit 
speed from the user side by selecting the number of logic levels 
when running the system. 

This report includes bench mark results. 


Preface to the Special Issue on 
Semiconductors 


● Hiroyuki Ino, General Manager of Semiconductor Group


Due to remarkable developments in electronics technology, modern society continues to
evolve and grow at a much higher speed than we previously expected. Electronics products are
now widely used in our daily lives at work and at home. Present society has thus become highly
information oriented, which in turn has stimulated industries to expand even more. Needless to
say, semiconductor technology is the basic technology supporting this information-oriented
society and these industrial developments.

Fujitsu recognized the importance of semiconductor technology at an early stage and
started its research and development as early as the 1950s. Since then, semiconductor
development at Fujitsu has formed one of the three major business areas of the company, together
with computers and communications. As a consequence, semiconductor technology at
Fujitsu has grown to become the key to Fujitsu’s high-quality electronics products. For example,
the 100-gate ECL gate array developed in 1973 enabled the realization of a full-scale LSI computer
for the first time in the world. Since then, Fujitsu semiconductor technology has made it possible
to develop a series of computers with steadily increasing performance and scale. Having
accumulated the technological improvements of this period, Fujitsu started full-scale supply of
semiconductor products to the domestic market about 15 years ago, and then to overseas markets
about 10 years ago. We have now established customer support systems in North America, Asia,
and Europe through the three subsidiaries FMI, FMP, and FMG, respectively.

With this historical background, Fujitsu semiconductor technology has always progressed with
the aim of producing the most advanced devices of the time in terms of performance and scale of
integration. Following the development of the world’s first bipolar gate array in the 1970s, Fujitsu has
continuously introduced high-speed, large-scale gate arrays. For example, in the late 1970s we
developed a 400-gate ECL gate array that had the highest speed in the world at that time. Presently,
ultra-high-speed gate arrays of 4500 gates and 10000 gates are in the product list. This year, a
CMOS gate array of 100000 gates was developed and distributed to customers. With all of
these, our gate arrays are favorably received by many international customers and hold the No. 1
market position in the world. In addition, we have now begun commercial production of standard cells
and have already developed a 60 000-gate general-purpose DSP.

Outstanding achievements have also been made in the field of memory. For example, in 1978
Fujitsu was the first in the world to commercially produce a 64 Kbit DRAM, and now we are
developing a 16 Mbit DRAM following the shipment of the 4 Mbit DRAM. Also, at the beginning
of the 1980s an ECL DRAM with the world’s fastest access time of 5 ns was developed, and presently a
256 Kbit ECL RAM made by BiCMOS technology is supplied to the market.

Fujitsu’s microprocessor line has a new 32-bit family in addition to the 4-bit, 8-bit, and 16-bit
microprocessor families. Two types of the new family have been developed and introduced to the market: one
is the Gmicro series based on the TRON architecture, and the other is the SPARC series based on an
ultra-high-speed architecture (RISC).
Fujitsu compound semiconductors also have the No. 1 or 2 market share in the field of microwave
and optical communication applications. Our activities in this area range from basic
research to application technology. This has resulted in great developments in new compound
semiconductor products, and has led to Fujitsu’s invention of the HEMT and the RHET.

Although Fujitsu always aims to develop devices of the most advanced level, we believe that 
achieving high reliability of the device is even more important. As the scale of devices increases and 
their functions become more complex, high reliability becomes extremely important. This concept 
forms one of the important policies for the development of Fujitsu semiconductors. 

This special issue has been published to introduce some of these Fujitsu semiconductor technologies
to our international clients. It contains various articles from a wide range of technology
fields. It includes in particular the latest device technology and process technology, and R&D
results of advanced device technology. We hope that this issue will be useful in designing
electronic equipment and systems, and that it will assist future development plans.

Fujitsu will continuously endeavor to develop devices with high performance and high reliability
at a low cost in order to meet the requirements of clients.
High-Speed Bipolar Logic IC


● Ken-ichi Ohno ● Hirofumi Takeda


(Manuscript received September 6, 1988) 


Continuing advances in silicon technology have produced IC devices with increasingly high
speed, high density, and low power dissipation.

This paper reviews the trends at Fujitsu in high-speed bipolar logic IC devices.
Methods suggested for improving the speed of ECL LSI devices are also outlined, and a 100-ps
10K-gate ECL gate array and a 2.7 GHz prescaler are given as examples of advanced
bipolar logic IC devices.


1. Introduction 

Transistor-transistor logic (TTL) and emitter- 
coupled logic (ECL) are the main circuit types 
of bipolar logic ICs. BiCMOS, a combination of 
bipolar and CMOS circuits, will become widely 
used in the medium-speed regions between 
ECL and the higher-speed CMOS and TTL. 

Due to their high-speed operation, ECL 
integrated circuits (ICs) are now used in high- 
performance systems such as super computers, 
general purpose mainframes, communications 
equipment, and LSI testers. These ICs are being 
targeted for use in other equipment, including 
super minicomputers and engineering work 
stations. 

This paper focuses on the trends in ECL 
high-speed bipolar ICs and introduces the ECL 
technology developed at Fujitsu. 


2. History of Development at Fujitsu 
2.1 High-speed bipolar logic IC applications1),2)

Table 1 outlines the 20 years of develop- 
ment of high-speed bipolar logic ICs at Fujitsu. 

Fujitsu’s first ECL prototype was developed 
over 20 years ago. The MB700 series of ECL ICs 
went into mass production in 1970. 

The 126-gate gate array developed in 1970 is
the original gate array configuration. The circuit
type was non-threshold logic (NTL). ECL
standard ICs of 2 ns and 126-gate NTL gate
arrays were used in the FACOM 230-75, Fujitsu’s
highest performance computer at the time.

FUJITSU Sci. Tech. J., 24, 4, pp. 265-270 (December 1988)

The 100-gate ECL gate array” was devel- 
oped in 1973. This was the first real gate array 
and was named the MB11K series. This array
was used for the first all-LSI computer, the
FACOM M-190. 

ECL gate arrays have been used in top-of- 
the-line computers since 1973. For example, the 
400-gate ECL (MB12K)*? is used in the FACOM 
M-380. The 3K-gate ECL (MB38K), and 1.2K- 
gate ECL with 16K-bit RAM (MB77K) are used 
in the FACOM M-780°). 

We developed the ET series (ET3000 and 
ET4500) in 1986 for the general market. 

This series was based on the ECL gate array 
technology developed for the computers men- 
tioned above and on the original general-purpose 
gate array (MB33K). The original ET is a typical 
general-purpose ECL gate array with mixed ECL 
and TTL I/O interface levels. The ETM series 
(ET2009M and ET3004M) developed in 1987 is 
an ET with ECL RAM on a single chip. The ETH 
series (ET10000H)® developed in 1988 is a
higher-speed ET. 

In addition to these gate arrays, high- 
performance prescalers were developed using 
silicon ECL technology”). 


2.2 Trends in high-speed bipolar logic ICs. 


Figure 1 shows the trends in basic propaga-
tion delay time (tpd) for the Fujitsu products
Table 1. Development of high-speed bipolar logic ICs at Fujitsu

Year  Standard IC family   Gate arrays and gate masterslices     Others (e.g. ASSP)
1969  2 ns ECL (MB700)
1970  1 ns ECL (MB800)     126-gate NTL (MB9200)
1972  2 ns ECL (MB10K)
1973                       100-gate ECL (MB11K)
1975  0.5 ns ECL (MB810)
1978                       400-gate ECL (MB12K)
1982                                                             1 GHz prescaler (MB501)
1983                       1K-gate ECL (MB33K)
1984                       3K-gate ECL (MB38K)                   1.1 GHz prescaler (MB501L)”
                           1.2K-gate ECL with RAM (MB77K)
1985                                                             1.6 GHz prescaler (MB505)
1986                       3K-gate ECL (ET3000)                  2.4 GHz prescaler (MB506)
                           4.5K-gate ECL (ET4500)
1987  0.15 ns ECL (MB880)  2K-gate ECL with RAM (ET2009M)
                           3K-gate ECL with RAM (ET3004M)
                           1 Gbit/s ECL gate masterslice (E32)
1988                       10K-gate ECL (ET10000H)               1.1 GHz prescaler with PLL (MB1501)
                           2.5 Gbit/s gate masterslice (E128H)   2.7 GHz prescaler (MB510)


[Figure 1: plot of tpd (ps) against year, 1970 to 1990, for internal ECL gate array gates, external ECL standard IC gates, and internal NTL gate array gates.]

Fig. 1. Trends in basic propagation delay time (tpd).


listed in Table 1. As shown in Fig. 1, ECL tech- 
nology is making steady progress in improving 
propagation delay time. In 1987®) , 40 ps ECL 
samples were obtained experimentally; this 
strengthened this trend. 

The tpd per gate depends on the circuit’s
power dissipation (Pd). The relation between the
tpd and Pd of the devices in Fig. 1 is shown in
Fig. 2. Figure 2 also shows the trends of these
two features in internal gates and external gates
of ECL ICs. For internal gates, progress is
being made towards higher speed and lower
power dissipation. For external gates, progress is
being made towards higher speed only.

Fig. 2. tpd vs. Pd per gate.

High-density devices with high gate counts 
per chip are effective in systems that require 
high speed, low cost, small size, and high reli- 
ability. Figure 3 shows that the gate counts per 
chip for the above ECL gate arrays are increasing 
steadily at about 40 percent a year. 
Fig. 3. Trends in gate counts.
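The growth figure quoted for gate counts can be checked roughly against Table 1. The following Python sketch is an illustration only, not part of the original analysis: it assumes simple compound growth between two entries of Table 1, the 100-gate MB11K (1973) and the 10K-gate ET10000H (1988), and the function names are hypothetical.

```python
import math


def annual_growth_rate(count_start, count_end, years):
    """Compound annual growth rate between two gate counts."""
    return (count_end / count_start) ** (1.0 / years) - 1.0


def doubling_time(rate):
    """Years needed for the gate count to double at a constant annual rate."""
    return math.log(2.0) / math.log(1.0 + rate)


# Endpoints taken from Table 1: MB11K (1973) and ET10000H (1988).
rate = annual_growth_rate(100, 10_000, 1988 - 1973)
print(f"average growth 1973-1988: {rate:.1%} per year")
print(f"doubling time at 40 percent a year: {doubling_time(0.40):.2f} years")
```

On these two endpoints the average comes out somewhat below 40 percent a year, which is consistent with the approximate figure quoted in the text, and a 40 percent rate corresponds to a doubling of the gate count about every two years.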


3. Speed Improvements in ECL LSI 
3.1 ECL circuits 

Figure 4 shows three examples of ECL cir- 
cuits. An ECL circuit consists of a current 
switch, an emitter follower, and a common 
circuit to generate bias voltages. The main 
advantages of the ECL circuit are high-speed 
operation and strong logic functions. 

Figure 4a) shows an example of an external
basic gate used for an output gate terminated
outside the chip. The external gate has a high
power dissipation of more than 20 mW per gate
because it drives a low-resistance terminator
equal to the transmission line impedance.

Figure 4b) shows an example of an internal
basic gate with the same threshold voltage as
the external gate but a lower signal voltage swing.
The internal gate is terminated by a high-value
on-chip resistor. Low power dissipation can thus
be selected based on the tpd required and on
the wafer process technology used.

A series-gate ECL circuit can implement 
complex logic functions such as a D-latch with 
reset as shown in Fig. 4c). 

Table 2 lists examples of typical signal 
voltages for the external ECL and internal ECL 
gates of Fig. 4. 
[Figure 4: circuit diagrams, supplied from GND and VEE (−5.2 V/−4.5 V)]
a) External basic ECL gate (3-input OR/NOR gate)
b) Internal basic ECL gate (4-input OR/NOR gate)
c) Internal series gate ECL (D-latch with reset)

Fig. 4. Examples of ECL circuits.


Table 2. Examples of typical signal voltages for the ECL gates in Fig. 4

Voltage                    External gate (V)   Internal gate (V)
High-level output (VOH)    −0.85               −1.05
Low-level output (VOL)     −1.75               −1.55
Input threshold (Vth)      −1.3                −1.3
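The statement that the internal gate keeps the same threshold with a lower signal swing can be checked directly from the Table 2 values. The Python sketch below is illustrative only; it assumes the levels read from the scan are −0.85 V and −1.75 V for the external gate, −1.05 V and −1.55 V for the internal gate, and −1.3 V for both thresholds, and the names used are hypothetical.

```python
# Typical signal levels in volts, as read from Table 2.
# The signs and decimal points are assumptions where the scan is unclear.
LEVELS = {
    "external": {"v_oh": -0.85, "v_ol": -1.75, "v_th": -1.3},
    "internal": {"v_oh": -1.05, "v_ol": -1.55, "v_th": -1.3},
}


def swing(gate):
    """Logic swing: high-level output minus low-level output, in volts."""
    return LEVELS[gate]["v_oh"] - LEVELS[gate]["v_ol"]


def midpoint(gate):
    """Voltage halfway between the two output levels, in volts."""
    return (LEVELS[gate]["v_oh"] + LEVELS[gate]["v_ol"]) / 2.0


for gate in LEVELS:
    print(f"{gate}: swing = {swing(gate):.2f} V, "
          f"midpoint = {midpoint(gate):.2f} V, "
          f"threshold = {LEVELS[gate]['v_th']:.2f} V")
```

Under these assumed values the swing drops from 0.9 V for the external gate to 0.5 V for the internal gate, while the threshold sits at the midpoint of the output levels in both cases.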


3.2 Major factors in basic propagation delay 
time 
Table 3 lists the major electrical factors 
affecting internal basic delay. The factors are 
Table 3. Major factors affecting ECL circuit
propagation delay time

Category              Factor                                  Symbol
Circuit               Circuit configuration
                      Power dissipation (circuit current)     Pd
                      Signal voltage swing                    Vs
Device: transistor    Cutoff frequency                        fT
                      E-B junction capacitance                CEB
                      C-B junction capacitance                CCB
                      C-I junction capacitance                CCI
                      Base resistance                         RB
Device: resistor      Stray capacitance                       CR
Device: metal         Stray capacitance                       CM
                      Metal resistance                        RM

E: Emitter  B: Base  C: Collector  I: Isolation


classified into those of circuit design and device 
technology. 

Circuit design is mainly affected by circuit 
configuration, power dissipation, and signal 
amplitude. Faster speed can normally be gained 
by a higher power dissipation and smaller signal 
amplitude. 

To improve the speed, the cutoff frequency
(fT) must be increased and the junction capacitance
and base resistance of the transistor and the stray
capacitance of the resistors must be decreased.
To do this, Fujitsu has developed a fine pattern 
process and improved device structures such as 
the oxide surrounded transistor (OST)” , U- 
grooved isolation with thick field oxide (U- 
FOX)?), and emitter-base self-aligned structure 
with polysilicon electrodes and resistors
(ESPER)®?.

The high-speed bipolar wafer process tech-
nology is discussed separately in this issue!®?.



3.3 Main load delay factors 

The real gate delay of a chip consists of the 
basic delay and load delay. The main load delay 
in ECL LSI is the propagation delay time 
through wires connecting gates on the chip. This 
load delay depends on: the drivability of the 
driver gates, the wiring line capacitance, and 
wiring line resistance of connections from the 
driver gate to the receiver gates on the chip.

ET10000H internal gate.


Due to progress made in shortening the basic 
delay and increasing chip density, the wiring 
delay has become relatively longer as compared 
to the basic delay. An example of the wiring 
delay is given in Sec. 4.1. 


4. Examples of advanced bipolar logic IC devices

4.1 100 ps ECL gate array (ET10000H)

A 100 ps 10K-gate ECL gate array with 
mixed ECL and TTL interface levels was devel- 
oped for the general market in 1988°). 

The internal basic gate is a 4-input OR/NOR
gate. Complex logic functions are realized by
using a series-gate circuit. The sample circuits are
the same as those of Fig. 4. To increase and
adjust the drivability of the internal gate, one of
four emitter follower currents (IEF) can be
selected by changing the resistance of Ry,
shown in Figs. 4b) and 4c).

Figure 5 shows the variation of the delay
with chip wiring line length in this device
(ET10000H). The basic tpd is 100 ps; the
average tpd with 3 mm of wiring, a fan-in (F/I)
of 3, and a fan-out (F/O) of 3 is about 300 ps at an
emitter follower current of 0.8 mA.

The wafer process technology used for the
ET10000H is the ESPER®? process. This process
realizes an emitter width of 0.5 µm and three-layer
metallization.

Figure 6 shows the 13 mm x 13 mm chip. 

FUJITSU Sci. Tech. J., 24, 4, (December 1988)

Fig. 6—ET10000H chip (13 mm x 13 mm).

Fig. 7—PGA208-packaged ET10000H.

The chip is mounted on a 208-pin pin grid
array package (PGA208) or a 260-pin quadruple
flat package (QFP260). Figure 7 shows a PGA208-
packaged ET10000H. The chip has a power
dissipation of 13 W and a low thermal
resistance of 2.4 °C/W in an air flow of 5 m/s.
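
As a rough check on the thermal figures quoted for the ET10000H, the steady-state temperature rise can be estimated as power multiplied by thermal resistance. The sketch below assumes that the 2.4 °C/W figure is the chip-to-air resistance and that the full 13 W flows through it. The function name and the 25 °C ambient temperature are illustrative assumptions, not values from the paper.

```python
def temperature_rise(power_w: float, theta_c_per_w: float) -> float:
    """Steady-state temperature rise in degC: power (W) x thermal resistance (degC/W)."""
    return power_w * theta_c_per_w


# Values quoted in the text for the PGA208-packaged ET10000H at 5 m/s air flow.
rise_c = temperature_rise(13.0, 2.4)

# Hypothetical ambient temperature, chosen only for illustration.
ambient_c = 25.0

print(f"temperature rise: {rise_c:.1f} degC")
print(f"estimated chip temperature: {ambient_c + rise_c:.1f} degC")
```

Under these assumptions the chip runs roughly 31 °C above the cooling air.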

A comprehensive computer aided design
(CAD) program has been developed for the
ET10000H. This enables devices to be laid out
and fabricated after logic design with a short
turnaround time and without errors.

Fig. 9—MB510 prescaler chip (1.35 mm x 1.3 mm).


4.2 2.7 GHz prescaler (MB510) 

Fujitsu has developed a series of prescaler 
ICs by using the high-speed characteristics of 
bipolar ECL technology. The first gigahertz 
prescaler was the MB501 developed in 1982. 
This device was a 1 GHz prescaler with 150 mW 
per chip. An improved version of the MB501, the
MB501L, with higher speed and lower power
consumption was developed in 1984. This device 
was a 1.1 GHz prescaler operating up to 1.6 GHz 
at 50 mW per chip”? 

One of the newest prescalers is a 2.7 GHz
model called the MB510. It was developed in
1988 and operates up to 2.7 GHz at 50 mW per 
chip. Some devices even operate at 3.6 GHz. 

Figure 8 shows the logic block diagram of
the MB510 prescaler. Each D flip-flop is formed
by two series-gate ECL circuits. Figure 9 shows
the 1.35 mm x 1.3 mm chip. The packages are
8-pin DIPs and 8-pin small outline packages
(SOP8), shown in Fig. 10.

Fig. 10—MB510 prescaler packages.


5. Conclusion 

Over the last 20 years Fujitsu has developed 
high-speed logic ICs (mainly ECL type) and 
continues to achieve higher speeds and densities. 

Fujitsu has used, and will continue to use, its
experience to develop and market high-performance
ECL gate arrays, ECL gate masterslices,
and gigahertz prescalers for high-speed computers
and other equipment requiring high-speed
devices.

High Electron Mobility Transistors (HEMTs) are very promising devices for ultra-high-speed
LSI/VLSI because of the supermobility GaAs/AlGaAs heterojunction structure.

This paper discusses the current status and recent advances of HEMT technology for high-
performance VLSI with a focus on materials, self-alignment device fabrication, and HEMT
LSI implementations.

HEMTs have already been used to develop a high speed 16 kbit static RAM and a 4.1k-gate
gate array.


1. Introduction 

Eight years have now passed since the 1980 
announcement of the High Electron Mobility 
Transistor (HEMT)!?. HEMT technology has 
opened the door to new possibilities for ultra- 
high-speed large-scale integration (LSI)/very 
large-scale integration (VLSI) applications??. 
The evolution of high-speed HEMT integrated 
circuits (ICs) is the result of continuous progress 
in the utilization of supermobility in the GaAs/ 
AlGaAs heterojunction structure. Electron mobility
in the conventional GaAs metal semiconductor
field-effect transistor (MESFET)
channel with typical donor concentrations of
around 10¹⁷ cm⁻³ ranges from 4 000 cm²/V·s
to 5 000 cm²/V·s at room temperature. Because
of ionized impurity scattering, the mobility in
the channel at 77 K is not much higher than it is
at room temperature. However, in undoped
GaAs, electron mobility of 2 to 3 x 10⁵ cm²/V·s
has been obtained at 77 K. The mobility of
GaAs with feasibly high electron concentrations
(i.e. feasible for device fabrication) was increased
by using the modulation-doping
techniques that were demonstrated in GaAs/
AlGaAs superlattices”). The first application of
this mobility enhancement to a new transistor
was the high electron
mobility transistor (HEMT). This device was
based on a modulation-doped GaAs/AlGaAs single
heterojunction structure!» and had a greatly
improved 77 K channel mobility.

HEMT technology shows promise in ultra- 
high-speed LSI/VLSI applications!»*©. Due 
to the supermobility GaAs/AlGaAs heterojunc- 
tion structure, the HEMT is suitable for 
operation at liquid nitrogen temperature. 
In 1981 a HEMT ring oscillator with a gate
length of 1.7 µm demonstrated a switching delay
of 17.1 ps with a power dissipation of 0.96 mW
per gate at 77 K. This indicated that switching
delays below 10 ps are achievable with 1 µm
gate devices”. Switching delays of 5.8 ps with
a power dissipation of 1.76 mW per gate at 77 K
and 10.2 ps with a power dissipation of
1.03 mW per gate at 300 K have been achieved
with a 0.35 µm gate device’. Even at room
temperature, a switching delay of 9.2 ps with
4.2 mW per gate has been obtained with
a 0.28 µm gate device®.
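
One way to compare the ring oscillator results quoted above is the power-delay product, that is, the switching delay multiplied by the power per gate. This figure of merit is not stated in the text; the sketch below simply derives it from the quoted delay and power values. The list layout and function name are illustrative.

```python
# (gate length in um, temperature in K, delay in ps, power per gate in mW),
# taken from the ring oscillator results quoted in the text.
RING_OSCILLATOR_RESULTS = [
    (1.7, 77, 17.1, 0.96),
    (0.35, 77, 5.8, 1.76),
    (0.35, 300, 10.2, 1.03),
    (0.28, 300, 9.2, 4.2),
]


def power_delay_product_fj(delay_ps: float, power_mw: float) -> float:
    """Power-delay product in femtojoules (1 ps x 1 mW = 1e-15 J = 1 fJ)."""
    return delay_ps * power_mw


products_fj = [
    power_delay_product_fj(delay_ps, power_mw)
    for _, _, delay_ps, power_mw in RING_OSCILLATOR_RESULTS
]

for (lg_um, temp_k, _, _), pdp in zip(RING_OSCILLATOR_RESULTS, products_fj):
    print(f"Lg = {lg_um} um at {temp_k} K: {pdp:.1f} fJ per gate")
```

On these figures, the 0.35 µm device switches for roughly 10 fJ at both temperatures, while the 0.28 µm result trades a higher energy per transition for room-temperature speed.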

For LSI level complexity, HEMT technology
has been used to develop a 4 kbit static
RAM?”?!9 and a 16 kbit static RAM!” as memory
circuits, and a 4.1k-gate gate array with a
16 x 16-bit parallel multiplier’? as a logic
circuit. The 4 kbit static RAM has an address
access time of 500 ps with a power dissipation
of 5.7 W per chip. This device has ECL-compatible
levels!®). The 16 x 16-bit parallel multiplier
designed on a 4.1k-gate gate array has a
multiply time of 4.1 ns with a power dissipation
of 6.2 W. HEMT technology has already entered
the LSI/VLSI application field.

This paper first presents the performance
advantages of HEMT approaches with a focus
on scaled-down device structure in the submicron
dimensional range. Next, we describe
HEMT technology for VLSI, including materials
and self-alignment device fabrication technology.
Finally, we review the current status and
recent advances in HEMT logic and memory
LSI circuit implementation and project future
HEMT VLSI prospects in ultra-high-speed
computer applications.


2. Performance advantages of HEMT approaches 

HEMT technology presents new oppor- 
tunities for high-speed low-power LSI/VLSI. 
This section describes the principles of the 
HEMT and the HEMT performance when it is 
scaled-down to the submicron dimensional 
range. 


2.1 HEMT principles 

Figure 1 shows a cross-sectional view of the 
basic HEMT with a selectively doped GaAs/ 
AlGaAs heterojunction structure. An undoped 
GaAs layer and an Si-doped n-type AlGaAs layer 
are successively grown by Molecular Beam 
Epitaxy (MBE) on a semi-insulating GaAs sub- 
strate. Because of the higher electron affinity 
of GaAs, free electrons in the AlGaAs layer 
move to the undoped GaAs layer and form a 
two-dimensional high-mobility electron gas with- 
in 10 nm of the interface. 

As the temperature decreases, the electron
mobility, which is about 8 x 10³ cm²/V·s at
300 K, increases dramatically and reaches 2 x
10⁵ cm²/V·s at 77 K due to reduced phonon
scattering. A further steep increase
occurs below 50 K. A maximum value
of 1.5 x 10⁶ cm²/V·s in the dark and 2.5 x
10⁶ cm²/V·s under light illumination is attained
at 4.2 K.

In HEMT structures, the AlGaAs layer
heavily doped with donors like Si contains DX
centers!*) that behave as electron traps at low
temperatures. Certain anomalous behavior at
low temperature is believed to be related to
these traps. These phenomena include distortion
of the drain I-V characteristics, an unexpected
threshold voltage shift at low temperatures, and
a highly sensitive and persistent photoconduction.
We have found that the distortion of the
drain I-V characteristics is related to the type of
device structure. To eliminate drain current
collapse at low temperature, we have adopted
a self-aligned gate structure as shown in Fig. 1.
The n-AlGaAs layer is completely covered by
the n-GaAs top layer, so there are no exposed
surfaces at the drain end of the gate. In this
structure, high energy electrons can easily pass
through the thin (30 nm) n-AlGaAs layer without
being trapped and can reach the n-GaAs top
layer; this eliminates the anomalous drain I-V
characteristics at low temperature.

Fig. 1—Cross-sectional view of the basic structure of a
HEMT with a selectively doped GaAs/AlGaAs
heterostructure.


2.2 HEMT performance in the submicron dimensional range

HEMT has a performance advantage over 
conventional devices. This advantage comes 
from the superior electron dynamics of HEMT 
channels and the unique electrical properties 
of the HEMT structure. During switching, the 
speed of the device is limited by both low-field 
mobility and saturated drift velocity. A low- 
field mobility of 8 000 cm²/V·s at 300 K and
40 000 cm²/V·s at 77 K is routinely obtained.
A saturated drift velocity of 1.5 to 1.9 x
10⁷ cm/s in HEMT structures at room temperature
has been reported. These superior transport
properties of HEMT channels result in a high
average current-gain cutoff frequency fT.

Fig. 2—Current-gain cutoff frequency vs. gate length of
experimental HEMT and GaAs MESFET.

The logic voltage swing of LSI circuits with
low power dissipation should be minimized. The
HEMT achieves a high transconductance gm with
a small logic voltage swing. The transconductance
gm in the gradual channel approximation is given by
gm = K (VGS - VT), where the notations have their
usual meanings. K is given by K = ε μn WG / (2 d LG),
where ε is the dielectric constant,
μn the electron mobility, WG the channel width,
d the spacing between the gate and channel,
and LG the gate length. The K value of a
0.5 µm gate HEMT at 77 K is calculated to be
900 mA/V² per millimeter of gate width. This
K value is about eight times higher than for
conventional GaAs MESFETs. A smaller
logic voltage swing requires more precisely
controlled threshold voltages with a smaller
standard deviation. The state-of-the-art standard
deviation of the threshold voltage of enhancement-
mode HEMTs is 11 mV over a full 3-inch
diameter wafer (described in the following
section). This value indicates a controllability
of less than two percent for a logic voltage
swing of 0.8 V.
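
The gradual channel expressions quoted above can be written as two small functions. This is a minimal sketch of those formulas only; it does not model velocity saturation or parasitic resistance, so it is not expected to reproduce measured values. The gate bias of 0.5 V and threshold of 0.2 V in the usage lines are hypothetical illustration values.

```python
def k_factor(eps: float, mobility: float, width: float,
             spacing: float, gate_length: float) -> float:
    """K = eps * mu * W / (2 * d * L), with all quantities in consistent units."""
    return eps * mobility * width / (2.0 * spacing * gate_length)


def transconductance(k: float, v_gs: float, v_t: float) -> float:
    """gm = K * (V_GS - V_T), the form given in the text."""
    return k * (v_gs - v_t)


# Halving the gate length doubles K when everything else is held fixed
# (unit placeholders are used for eps, mobility, width, and spacing).
k_long = k_factor(1.0, 1.0, 1.0, 1.0, 1.0e-6)
k_short = k_factor(1.0, 1.0, 1.0, 1.0, 0.5e-6)
k_ratio = k_short / k_long

# K quoted in the text for a 0.5 um gate HEMT at 77 K, in mA/V^2 per mm.
K_QUOTED = 900.0
# Hypothetical bias point: V_GS = 0.5 V, V_T = 0.2 V.
gm_ms_per_mm = transconductance(K_QUOTED, 0.5, 0.2)

print(f"K ratio for half the gate length: {k_ratio:.1f}")
print(f"gm at 0.3 V overdrive: {gm_ms_per_mm:.0f} mS/mm")
```

With K in mA/V² per millimeter and voltages in volts, the result is in mS per millimeter of gate width.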

In Fig. 2, the current-gain cutoff frequency
fT versus gate length summarizes the typical
performance of experimental HEMTs and GaAs
MESFETs reported so far'*!®. At room
temperature, the values of fT were 38 GHz!”
and 80 GHz!°* for HEMTs with gate lengths of
0.5 µm and 0.25 µm respectively, about
twice the values for GaAs MESFETs. No significant
variation of threshold voltage with gate
length was observed in the range from LG =
1.4 µm to LG = 0.28 µm®. This low horizontal
sensitivity indicates that reducing the geometry
of HEMTs increases performance without
causing short-channel effect problems.

Fig. 3—Dependence of threshold voltage VT on gate
length LG for HEMT and GaAs MESFET.

The short channel effect is one of the most 
serious problems of applying submicron FET 
devices to integrated circuits. However, the 
HEMT structure has the inherent advantage 
of reducing the short-channel effect. This is 
because the gate-to-channel capacitance can be 
increased by raising the doping concentration 
of the n-AlGaAs layer by using the modulation 
doping technique. This shields the drain fields 
without degrading the semiconductor mobility. 
This will enable easily-designable and stable 
current-voltage characteristics for gates in the 
submicron range. Figure 3 shows how the thresh- 
old voltage varies with gate length for HEMTs 
and self-aligned gate GaAs MESFETs!®). It can 
be seen that the threshold voltage variation 
of submicron gate HEMTs is much smaller 
than that of GaAs MESFETs. The variation of 
HEMTs is less than 30 mV when the gate length
is varied from 1.4 µm to 0.28 µm. Therefore,
existing HEMTs can potentially allow production
of LSI designed with a 0.25 µm rule. To
suppress the short-channel effect in a further
scaled-down HEMT, the thickness of the AlGaAs
layer must be reduced to raise the aspect ratio
(LG/d). Even in this case, the electron mobility
in the HEMT channel does not decrease with
increasing doping concentration in the thinned
AlGaAs layer. This is because electrons in the
channel are spatially isolated from the doped
AlGaAs layer.

Fig. 4—Dependence of K-factor and transconductance
gm of E-HEMTs on gate length LG at 77 K and
300 K.

Figure 4 shows the dependence of the K-
factor and transconductance gm of E-HEMTs on
gate length at 77 K and 300 K. Dashed lines
show the LG⁻¹ dependence of the K-factor and
gm expected from the gradual channel approximation.
Below a gate length of 1 µm at 300 K,
the K-factor and gm deviate from the LG⁻¹
dependence. Velocity saturation and
parasitic source resistance probably play a
significant role in these results. The 0.5 µm
gate E-HEMT at 300 K has a gm of 330 mS/mm
with large enough noise margins to allow stable
LSI operation.


3. HEMT technology for VLSI 

The development of high-performance VLSI 
requires new technological breakthroughs. This 
section describes the state-of-the-art HEMT
technology including materials and the self-
alignment device fabrication technologies.


3.1 Material technology for HEMT fabrication 

To grow high-quality material by MBE, we
optimized the buffer layer between the semi-
insulating GaAs substrate and the two-dimensional
electron-gas channel layer. The
thickness of this layer is 0.6 µm. The electron
mobility in this optimized heterostructure was
8 x 10³ cm²/V·s at 300 K. This increases to
1.2 x 10⁵ cm²/V·s at 77 K due to reduced
phonon scattering!??.

The surface defect problem of MBE is a
serious one at the LSI level of complexity”.
The surface irregularities are called oval defects.
Oval defects typically range from less than a
micrometer to several micrometers in size, which
is comparable to the size of the devices in an LSI circuit.
These oval defects seriously affect the current-
voltage characteristics of HEMTs?!). By optimizing
the growth conditions, we have already achieved
a density of less than 10 cm⁻² for defects
larger than 20 µm². This is required to develop
an LSI with 10k-gate logic and 64 kbit static
RAM circuits.

An important aspect of HEMT LSI fabrication
is the achievement of highly uniform
epitaxial wafer growth with high throughput
and large size. We optimized the geometrical
configuration of the source and substrate
in the molecular beam epitaxy (MBE) system
and optimized the growth conditions for highly
uniform epitaxial layers on a 3-inch diameter
semi-insulating GaAs substrate. These optimizations
resulted in a high throughput and
high quality. Selectively doped GaAs/n-AlGaAs
heterostructures were grown on semi-insulating
GaAs substrates mounted on a substrate holder
with a diameter of 190 mm?”). The substrate
temperature during growth was held at 660 °C.
To control the threshold voltage of
HEMTs, a uniformity of ±1% for the
thickness and the carrier concentration of the
AlGaAs layer is required.

The epitaxial growth of LSI quality material
with AlGaAs/GaAs heterostructure has also
been achieved using atmospheric pressure
OMVPE technology. To achieve the highly
uniform material characteristics required for
LSI fabrication, multi-wafer growth (three wafers
at a time) was performed using an rf-heated
graphite susceptor. The susceptor stage and each
wafer rotate simultaneously”>). The stage rotates
at 8 rpm and the wafers rotate at 20 rpm. The Cr-doped
GaAs substrates were oriented 2.5° off the
(100) towards the <110>. The source materials
used in the hydrogen carrier were trimethylgallium
(TMG), trimethylaluminum (TMA),
AsH3, and Si2H6.

The growth temperature was 600 °C. The
growth rate of GaAs was 58 nm/min, and the
growth rate of Al0.28Ga0.72As was 21 nm/min.
To achieve abrupt interfaces, the composition
of the source gas was changed instantaneously
by the "vent/run" method. The uniformity of
the thickness and carrier concentration of the
n-AlGaAs film across a two-inch wafer was
better than ±2.0% and ±1.5% respectively.
The sheet carrier concentration was 1.1 x
10¹² cm⁻² and the electron mobility was
6 400 cm²/V·s at 300 K for an AlGaAs/GaAs
selectively doped heterostructure with a 2.5 nm
spacer. At 77 K the sheet carrier concentration
was 9.3 x 10¹¹ cm⁻² and the electron mobility
was 46 000 cm²/V·s. Increasing the spacer
thickness to 7.5 nm increased the electron
mobility at 77 K to 90 000 cm²/V·s. To check
the quality of wafers grown by this system, we
used them to fabricate HEMT inverters with
E-HEMTs and D-HEMTs. The standard deviations
of threshold voltages across a two-inch
wafer were 23 mV for E-HEMTs and 35 mV
for D-HEMTs. The transconductance was
250 mS/mm and the current-gain cutoff frequency
was 23 GHz for an OMVPE-grown
HEMT with a gate length of 0.8 µm. These
values compare favorably with those of an MBE-
grown HEMT.


3.2 Self-alignment device fabrication technology 
Figure 5 shows a cross-sectional view of a
typical self-aligned structure of enhancement-
mode (E) and depletion-mode (D) HEMTs. The
structure forms an inverter for a DCFL circuit
configuration™.

The basic epilayer structure consists of
a 600 nm undoped GaAs layer, a 30 nm
Al0.3Ga0.7As layer doped with Si to
2 x 10¹⁸ cm⁻³, and a 70 nm GaAs top layer
grown successively on a semi-insulating substrate
by MBE. The low-field electron mobility
found from Hall measurements was
7 200 cm²/V·s at 300 K and 38 000 cm²/V·s at
77 K. The concentration of the two-dimensional
electron gas (2DEG) was 1.0 x 10¹² cm⁻² at
300 K and 8.2 x 10¹¹ cm⁻² at 77 K. The AlAs
mole fraction was 0.3. It can be expected that
higher AlAs mole fractions will increase the
maximum achievable concentration of 2DEG
and will therefore result in an increase in the
transconductance of HEMTs. However,
AlxGa1-xAs with a high AlAs mole fraction
exhibits inferior surface morphology and an
increase in deep traps, making device fabrication
difficult. A thin Al0.3Ga0.7As layer acting as a
stopper against selective dry etching is embedded
in the top GaAs layer to fabricate E- and
D-HEMTs on the same wafer. By adopting this
new device structure, we can apply the selective
dry etching of GaAs over AlGaAs and achieve
precise control of the gate recessing process for
E- and D-HEMTs.

Fig. 5—Cross-sectional view of a typical self-aligned
structure of E- and D-HEMTs forming an inverter
for DCFL circuit configuration.

Figure 6 shows the self-aligned gate process
used in the fabrication of E- and D-HEMTs
forming an inverter for a DCFL circuit configuration.
First, the active region is isolated
by oxygen implantation at 130 keV to a dose of
10¹² cm⁻². This makes a planar structure. The
source and drain for the E- and D-HEMTs are
metalized with AuGe/Au to form ohmic contacts.
Next, fine gate patterns are formed for
E-HEMTs, and the top GaAs layer and thin
Al0.3Ga0.7As stopper are etched off by non-
selective chemical etching. Using the same resist
after the formation of gate patterns for D-
HEMTs, selective dry etching is performed to remove
the top GaAs layer for D-HEMTs and to remove
the GaAs layer under the thin Al0.3Ga0.7As
stopper for E-HEMTs. Next, Schottky contacts
for the E- and D-HEMT gates are made by
depositing Al; the Schottky gate contacts and the
GaAs top layer for ohmic contact are self-aligned
for high-speed performance. Finally, Ti/Pt/Au
electrical connections from the interconnecting
metal to the device terminals are made through
contact holes etched in a crossover SiON insulator
film deposited by plasma-enhanced CVD.

Fig. 6—Basic processing steps for HEMT LSI.

Fig. 7—Histograms of threshold voltages for D-HEMTs
and E-HEMTs over a full 3-inch diameter wafer.

Fig. 8—Cross sectional SEM microphotograph of a
HEMT with a gate length of 0.5 µm.

This method combines a unique epistructure
with self-terminating selective dry recess etching
to enable simultaneous fabrication of super-
uniform E- and D-HEMTs with the uniformity
of MBE-grown epitaxial films. The key technique
for stable fabrication of self-aligned gate
HEMTs is the selective dry etching of the
GaAs/AlGaAs layer. Etching with
CCl2F2 + He discharges achieved a selectivity
ratio of more than 260. The etching rate of
Al0.3Ga0.7As is as low as 2 nm/min (the rate for
GaAs is about 520 nm/min). Figure 7 shows
histograms of threshold voltages for E- and
D-HEMTs over a full 3-inch diameter wafer. The
standard deviations of threshold voltages are
11 mV for E-HEMTs and 14 mV for D-HEMTs.
The ratio of the standard deviation of threshold
voltage (11 mV) to the logic voltage swing
(0.8 V for DCFL) is 1.5 percent. This indicates
excellent controllability of MBE growth and the
LSI fabrication process, and makes these
technologies well suited to the fabrication of ICs
with LSI/VLSI-level complexities. The vertical
threshold sensitivity is calculated to be
70 mV/nm?” at a VT of 0.13 V and a carrier
concentration of 2 x 10¹⁸ cm⁻³. As shown in
Fig. 7, the deviation in threshold voltage over
the wafer for the E-HEMT is 60 mV at a VT of
0.28 V. This corresponds to a thickness
deviation of only 1 nm over a 3-inch wafer,
again indicating excellent controllability of MBE
growth and the device fabrication process.
Figure 8 shows a cross sectional SEM micro-
photograph of a HEMT with a gate length of
0.5 µm.
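
The process figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below recomputes the etch selectivity from the two etching rates, the ratio of threshold standard deviation to logic swing, and the layer thickness deviation implied by the vertical threshold sensitivity. The function names are illustrative; all numeric inputs are taken from the text.

```python
def etch_selectivity(rate_gaas_nm_min: float, rate_algaas_nm_min: float) -> float:
    """Ratio of the GaAs etching rate to the AlGaAs etching rate."""
    return rate_gaas_nm_min / rate_algaas_nm_min


def sigma_to_swing_percent(sigma_mv: float, swing_v: float) -> float:
    """Threshold standard deviation as a percentage of the logic voltage swing."""
    return 100.0 * sigma_mv / (swing_v * 1000.0)


def thickness_deviation_nm(delta_vt_mv: float, sensitivity_mv_per_nm: float) -> float:
    """Thickness deviation implied by a threshold deviation and a vertical sensitivity."""
    return delta_vt_mv / sensitivity_mv_per_nm


selectivity = etch_selectivity(520.0, 2.0)
ratio_percent = sigma_to_swing_percent(11.0, 0.8)
thickness_nm = thickness_deviation_nm(60.0, 70.0)

print(f"selectivity: {selectivity:.0f}")
print(f"sigma(VT) / swing: {ratio_percent:.2f} %")
print(f"implied thickness deviation: {thickness_nm:.2f} nm")
```

The computed ratio of about 1.4 percent and thickness deviation of about 0.9 nm agree with the rounded values of 1.5 percent and 1 nm given in the text.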


4. HEMT LSI circuit implementations 

This section reviews and discusses the current
implementations and recent advances in
HEMT logic and memory LSI circuits.

4.1 Logic circuits

A HEMT 4.1k-gate gate array with E/D type
DCFL circuits was designed and fabricated for
use in logic circuits.

Fig. 9—Microphotograph of a HEMT 4.1k-gate gate
array. The array measures 4.8 x 6.3 mm² and
contains 17 692 E- and D-HEMTs.

Figure 9 shows a microphotograph of a
4.1k-gate gate array !). This gate array consists
of 156 I/O cells and 4096 basic cells. The basic
cell includes one depletion-mode D-HEMT and
three enhancement-mode E-HEMTs with a gate
length of 0.8 µm. It can be programmed as a
3-input NOR gate. The cell is 37.5 µm x 45 µm.
The gate array consists of 32 columns with 128
cells in each column. Between the columns there
are 15 interconnection tracks; each track is 2 µm
wide with a 2 µm spacing. The chip of this gate
array has 100 pads: 72 for I/O signals and 28
for power supply. To obtain a sufficient noise
margin, the VDD voltage drop and GND voltage
rise have been minimized by careful arrangement
of the power supply pads. Therefore the chip
has a relatively large number of power supply
pads. The chip measures 6.3 mm x 4.8 mm and
contains 17 692 devices. The average delay time
is 27 ps for the inverter and 40 ps for the basic
gate. The difference between the two values is
due to crossover capacitance between the gate
electrode and power supply lines. The basic
gates are covered with power supply lines in
order to make the array more compact.
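
The gate array organization described above can be checked with simple arithmetic. The sketch below recomputes the basic cell count, the number of HEMTs in basic cells, and the area occupied by the cell array. Attributing the remaining devices to I/O cells and other peripheral circuits is an assumption made here, not a statement from the paper.

```python
COLUMNS = 32
CELLS_PER_COLUMN = 128
HEMTS_PER_BASIC_CELL = 1 + 3        # one D-HEMT plus three E-HEMTs
TOTAL_DEVICES = 17692               # device count quoted for the chip
CELL_W_UM, CELL_H_UM = 37.5, 45.0   # basic cell dimensions
CHIP_W_MM, CHIP_H_MM = 6.3, 4.8     # chip dimensions

basic_cells = COLUMNS * CELLS_PER_COLUMN
basic_cell_hemts = basic_cells * HEMTS_PER_BASIC_CELL
# Devices outside the basic cells (assumed to be I/O cells and peripherals).
remaining_devices = TOTAL_DEVICES - basic_cell_hemts

cell_array_area_mm2 = basic_cells * CELL_W_UM * CELL_H_UM * 1e-6
chip_area_mm2 = CHIP_W_MM * CHIP_H_MM

print(f"basic cells: {basic_cells}")
print(f"HEMTs in basic cells: {basic_cell_hemts}")
print(f"remaining devices: {remaining_devices}")
print(f"cell array area: {cell_array_area_mm2:.2f} mm^2 "
      f"of {chip_area_mm2:.2f} mm^2 chip area")
```

The 32 x 128 organization gives exactly the 4096 basic cells stated in the text, and the cells themselves occupy a little under a quarter of the chip area, with the rest taken by wiring channels, I/O cells, and pads.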

Fig. 10—Comparison of recent GaAs MESFET and
HEMT multiplier gate propagation delays as
a function of gate power dissipation.

Fig. 11—Basic unloaded propagation delay time and
power dissipation as a function of supply
voltage.

Fig. 12—Propagation delay time of a DCFL gate circuit
as a function of loading conditions.

The 4.1k-gate gate array uses a 16 x 16 bit
parallel multiplier as a test vehicle. The
16 x 16 bit multiplier uses 98 percent of this
array and consists of registers for a 16 x 16 bit
multiplier, 15 half-adders, 210 full-adders, and
a carry look ahead circuit. A multiplication time
of 4.1 ns at 300 K, including a 5-stage I/O buffer
delay, was achieved with a supply voltage VDD
of 1.1 V and a total chip power dissipation of
6.2 W. This is the fastest multiplication time
ever reported for a 16 x 16 bit parallel multiplier.
From a simulation using the SPICE II
program, we confirmed that, with a fan-out of
2.6 and a 363 µm interconnection line, the
multiplication time was about 49 times the
typical gate delay of 80 ps. This simulation gave
an I/O buffer delay of about eight percent of the
multiplication time. The intrinsic multiplication
time was 3.8 ns. Figure 10 compares the gate
delay and the power dissipation of the state-of-
the-art GaAs MESFET?) 7 and HEMT multi-
pliers!?)> 27)-30).
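
The timing figures quoted for the multiplier can be related to each other with a short calculation. The sketch below multiplies the stated number of gate delays by the typical gate delay, and removes the stated I/O buffer share from the measured multiplication time. Treating the eight percent figure as applying directly to the measured 4.1 ns is an assumption made here for illustration.

```python
GATE_DELAY_PS = 80.0        # typical loaded gate delay from the simulation
GATE_DELAYS_IN_PATH = 49    # multiplication time in units of gate delay
MEASURED_TIME_NS = 4.1      # measured multiplication time at 300 K
IO_BUFFER_FRACTION = 0.08   # I/O buffer share of the multiplication time

# Multiplication time implied by the simulated gate delay.
simulated_time_ns = GATE_DELAYS_IN_PATH * GATE_DELAY_PS / 1000.0

# Intrinsic time after removing the I/O buffer share (assumed to apply to 4.1 ns).
intrinsic_time_ns = MEASURED_TIME_NS * (1.0 - IO_BUFFER_FRACTION)

print(f"simulated multiplication time: {simulated_time_ns:.2f} ns")
print(f"intrinsic multiplication time: {intrinsic_time_ns:.2f} ns")
```

The simulated value of about 3.9 ns and the intrinsic value of about 3.8 ns are consistent with the measured 4.1 ns and the intrinsic time stated in the text.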

Performance of the half-micron HEMT
DCFL gates was measured by using different
types of ring oscillators. The basic propagation
delay time tpd0 was 22.5 ps with a power
dissipation of 3.9 mW/gate and a supply voltage
VDD of 2 V. The standard deviation of tpd0 was
1.0 ps over an area of 30 mm x 30 mm. The
noise margins of the basic inverters were 220 mV
(NML) and 280 mV (NMH). Figure 11 shows
tpd as a function of the
supply voltage. Figure 12 shows the propagation
delay time as a function of the loading conditions
for a DCFL gate circuit. The loaded delay
time (F/I = F/O = 3, l = 1 mm) is 84 ps. The
measurements of the RAM were made on a
wafer at room temperature using coaxial probe
cards.

A multibit data register circuit using HEMTs
with a gate length of 0.5 µm has been developed?).
The register is designed to synchronize
data signals for data transfer in systems with a
high clock rate. Figure 13 shows the block
diagram of a multibit data register circuit. The
clock pulse is applied to each latch through the
clock chopper. Then, 4 x 9 bit latched input
data is transferred to the output ports and
synchronized by the clock signal. The input and
output buffers provide signal levels compatible
with the ECL interface. Figure 14 shows a
microphotograph of the multibit data register.
The chip contains 1 137 gates, 99 signal pads,
and 36 power supply pads; it measures 6.1 mm x
6.2 mm. This chip was designed to minimize
differences in the propagation delays from the
clock input to each output. The width and
spacing of the interconnecting lines are both
2 µm. The logic circuit is 2.4 mm x 2.4 mm and
includes 3 335 HEMTs. The standard supply
voltages are -2 V and -3.6 V. The speed of the
multibit data register was measured as the delay
time from clock input to data output at room
temperature using a coaxial probe card. Figure 15
is an oscillograph showing the delay time from
the clock input to the data output. The delay
time is 490 ps at room temperature and the
power dissipation is 4.12 W. The gate delay
through the path from clock input to data
output was estimated to be 43 ps/gate.

Fig. 14—Microphotograph of a multibit data register.

Fig. 15—Oscillograph showing the delay time through
the critical path from the clock input to data
output.

Performance of HEMT VLSI for future high-
speed computers has been projected”) >”,
based on the HEMT performance
described above. Chip delay time is the
sum of the intrinsic gate delay, the logic layout
delay, which depends on fan-out, and the delay
of the wiring on the chip. Chip delays were
calculated based on experimental data for HEMTs
with a gate length of 0.5 µm at 300 K and 77 K.
Here we assume that the fan-out is three, the logic
swing is 0.8 V, the wiring capacitance is 100 fF/mm,
the average line length in the chip is 1 mm, and the
heat flux for liquid cooling is 20 W/cm². For 10⁴ gates,
the chip delay is 70 ps at 300 K and 40 ps at
77 K. This sub-100 ps performance is sufficient
for future high-speed computer requirements.

Fig. 16—Microphotograph of a HEMT 1k-word x 4 bit
static RAM. The array measures 2.6 x 3.0 mm²
and contains 29 994 E- and D-HEMTs.


4.2 Memory circuits 

A 1k-word x 4 bit static RAM using 0.5 µm
gate HEMT technology was designed and fabricated!®).
Figure 16 shows a microphotograph of
the RAM. A layout design rule of 1.5 µm line/
2.0 µm space was used for interconnections. The
minimum through-hole is 2 µm x 2 µm. The
memory cell is 24.5 µm x 23 µm; this is very
small for a GaAs memory LSI. The chip measures
2.8 mm x 3.0 mm.

Fig. 17—Block diagram of a HEMT 1k-word x 4 bit
static RAM.

Figure 17 shows the block diagram of the
1k-word x 4 bit static RAM. The memory cell
array is divided into four 1 kbit memory planes,
each with 32 rows and 32 columns. This
configuration reduces interconnections and
access time. The RAM has ECL-compatible
I/O interface circuits and supply voltages
of −2 V and −3.6 V. E/D-type DCFL was
used for the basic logic gate and the memory
cell. Source follower buffers were used for the
driver circuits and level shifters. A data line
equalization technique was adopted to reduce
the address access time. The address transition is
detected by the Address Transition Detector
(ATD). An ATD pulse is generated and used for
the Data Line Equalizer (DLE), as shown in
Fig. 17. In the circuit simulation, a propagation
delay of 22 ps was obtained for the basic DCFL
gate (F/I = F/O = 1). The simulation used an
E-HEMT threshold voltage of 0.2 V and a
D-HEMT threshold voltage of −0.6 V. Using this
DLE technique, the address access time was
reduced from 0.68 ns to 0.54 ns. The design
value of the chip power dissipation was 5 W.
This power dissipation is rather large, but it can
be reduced to less than 2 W by using a supply
voltage of −1 V.
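
The plane organization described above can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that each of the four planes stores one bit of every 4-bit word and that the 10-bit word address splits into 5 row bits and 5 column bits. The text does not give the actual address mapping, so the `decode` function is hypothetical.

```python
# Hypothetical address mapping for a 1k-word x 4 bit RAM built from
# four 32 x 32 memory planes. The real decoder layout is not given
# in the text; this only checks that the organization is consistent.
ROWS, COLS, PLANES = 32, 32, 4


def decode(address):
    """Split a 10-bit word address into (row, column) within a plane."""
    if not 0 <= address < ROWS * COLS:
        raise ValueError("address out of range")
    return address // COLS, address % COLS


total_bits = ROWS * COLS * PLANES
print(total_bits)                  # total storage in bits
print(decode(0), decode(1023))     # first and last word positions
```

With 32 rows and 32 columns per plane, the four planes together hold 4096 bits, which matches a 1k-word x 4 bit organization.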

An address access time of 0.5 ns was achieved 
at room temperature with a chip power dissipa- 
tion of 5.7 W. The chip select access time was 
0.25 ns. Figure 18 is an oscillograph of the 
superimposed 32 address signals and 32 outputs 
(Dout1). The variation of the address access
time in the 1 kbit memory plane was about
0.15 ns.

Fig. 18—Oscillograph of memory address access
operations. The oscillograph shows the superimposed
signals of address access in a 1 kbit memory
plane. The access time is 500 ps. The horizontal
scale is 500 ps/div.

A HEMT 16k-word x 1 bit fully decoded
static RAM was developed using the E/D-type
DCFL circuit configuration. D-HEMTs were
used for load devices and E/D-type DCFL
circuits were used for the basic circuit. The memory
cell is a six-transistor cross-coupled flip-flop and
uses switching devices with gate lengths of
1.2 µm. The chip size is 4.3 mm x 5.5 mm, and the
RAM cell size is 23 µm x 30 µm (690 µm²). The
RAM has a total device count of 107 519.
Figure 19 shows a microphotograph of the
16k-word x 1 bit static RAM. The dynamic
performance of the HEMT 16 kbit static RAM (e.g.
address access time) was evaluated both at room
temperature and at liquid nitrogen temperature.
The minimum address access time obtained was
3.4 ns with a chip power dissipation of 1.34 W at 77 K.
The GaAs MESFET 4k-word x 4 bit static
RAM with 1 µm gate length devices has an
address access time of 4.1 ns with a chip power
dissipation of 2.52 W at 300 K. Sub-nanosecond
address access time can be projected for
a 64 kbit static RAM using 0.5 µm gate device
technology.

Fig. 19—Microphotograph of a HEMT 16k-word x 1 bit
static RAM. The RAM measures 4.3 x 5.5 mm²
and contains 107 519 E- and D-HEMTs.


5. Conclusion 

This paper discusses the current status and 
recent advances of HEMT technology for high- 
performance VLSI with a focus on materials, 
self-alignment device fabrication, and HEMT LSI 
implementations. 

HEMTs are very promising devices for VLSI 
because of their ultra-high speed and low power 
dissipation. A fundamental switching delay 
below 10 ps can be obtained for HEMTs. This 
performance target is suitable for VLSI. The 
gate length dependence of threshold voltage and 
the K-factor of short-channel HEMTs were 
evaluated. Short-channel effects were not found 
to be a problem in microstructures in the sub- 
micron dimensional range. 

Since development of the first HEMT inte- 
grated circuit in 1981, there has been a rapid 
and continuing growth in device, circuit design, 
processing and material technologies. As the 
HEMT technology shifts from the research and 
development phase towards production, new 
developments in material technology or alterna- 
tive growth techniques will be required. These 
developments are needed to enable production 
of highly uniform epitaxial materials of LSI 
quality and for the large quantity production of 
wafers. 

A HEMT 4.1k-gate gate array with a 16 x
16 bit parallel multiplier has been developed to
achieve a multiplication time of 4.1 ns. A HEMT
4 kbit static RAM with an address access time of
500 ps and a 16 kbit static RAM with an address
access time of 3.4 ns have been developed to
demonstrate the feasibility of high-performance
VLSIs. We believe that sub-nanosecond access
operations can be achieved in a 64 kbit HEMT
static RAM using 0.5 µm gate device technology.
Using the experimental data on HEMT logic, we
project an optimized chip delay of 70 ps for a
10k-gate VLSI at room temperature. This
performance will achieve the speeds required for future
large-scale computers. Based on the results
described above, it is a certainty that HEMT
technology will grow into one of the most
important semiconductor technologies in the
twentieth century.

The present research effort is part of the
National Research and Development Program on
the “Scientific Computing System”, conducted
under a program of the Agency of Industrial
Science and Technology, Ministry of
International Trade and Industry, Japan.

This paper describes recent advances in high-speed digital circuits using all-niobium
(Nb/AlOₓ/Nb) Josephson junctions. The world’s fastest logic gate, the Modified Variable
Threshold Logic (MVTL) gate, is described. The MVTL gate family has been applied to various
logic circuits such as a 16-bit ALU (Arithmetic Logic Unit) and a 4-bit microprocessor. The
high-speed performance of Josephson junctions in LSI level circuits has been verified using
these circuits. A new type of high-sensitivity magnetic sensor, a SQUID (Superconducting
QUantum Interference Device), has also been invented. It is called “a single-chip SQUID”
because all the circuits necessary for its operation have been integrated into a single chip.


1. Introduction 

The performance of integrated circuits having
Josephson junctions has dramatically improved
since niobium junctions became available.
Before niobium junctions, lead-alloy
junctions were mainly used for integrated
circuits. Those were the dark ages for researchers
because circuits seldom worked well. Most of
the difficulties originated from the unstable
characteristics of lead-alloy junctions.

At the end of 1983, just after the use of 
lead-alloy junctions was abandoned, niobium 
junctions became available for integrated 
circuits. After applying the niobium junctions 
to integrated circuits, various kinds of circuits 
were realized. These circuits operated at a much 
higher speed than those using lead-alloy junc- 
tions. This is because of the small scattering 
of junction characteristics. Thus, the inherent 
performance of Josephson junctions could 
actually be attained. 

At present, we can make LSI level circuits
which contain several thousand junctions.
A small-scale Josephson computer can now be
envisioned. It will operate more than one order
of magnitude faster than a semiconductor
computer. In this paper, we describe recent
progress in Josephson IC technologies developed
in our laboratory. Chapter 2 describes
fabrication technologies for integrated circuits using
niobium junctions (i.e. Nb/AlOₓ/Nb junctions).
Chapter 3 describes our MVTL (Modified Variable
Threshold Logic) gate, which is the fastest logic
gate, semiconductor gates included. Chapter 4
describes various high-speed logic circuits:
a shift register, a 16-bit ALU (Arithmetic Logic
Unit), a 4-bit microprocessor, and other circuits.
A large amount of electronics
has been integrated into a single chip using
superconducting technology. This single-chip
SQUID (Superconducting QUantum Interference
Device) is also introduced in that
chapter. Conclusions are stated in Chap. 5.


2. Fabrication process 

This section describes the fabrication of
circuits constructed using Nb/AlOₓ/Nb
Josephson junctions.

The characteristics of niobium junctions
have recently been greatly improved. The
excellent characteristics of Nb/AlOₓ/Nb junctions
were demonstrated by Gurvitch et al. and
further improved by Morohashi et al. Thus,
we anticipate their use in high-speed digital
circuits and other applications. The
controllability, stability, uniformity, and reproducibility of
Nb/AlOₓ/Nb junctions are much better than those
of lead-alloy junctions.
Therefore we have applied the Nb/AlOₓ/Nb
junction in the fabrication process of various
high-speed digital circuits.

Table 1. Materials and gases for RIE

                     Material      Gas for RIE
Josephson junctions  Nb/AlOₓ/Nb    CF₄ + O₂
Wiring               Nb            CF₄ + O₂
Insulation layers    SiO₂          CHF₃ + O₂
Resistors            Mo            CF₄ + O₂

The materials and gases used in the Reactive
Ion Etching (RIE) technique are listed in
Table 1. Metals were deposited by DC
magnetron sputtering in an Ar gas atmosphere
at deposition rates of 200 nm/min for Nb,
130 nm/min for Mo, and 6 nm/min for Al.
SiO₂ was deposited by RF sputtering in an Ar
atmosphere at a deposition rate of 8 nm/min.
The Al film surface was oxidized in ambient
Ar + 10% O₂ gas at room temperature. The
typical time required for the oxidation was
60 min. We can control the critical current
density from 500 A/cm² to 5000 A/cm² by
changing the gas pressure from 100 Pa to 40 Pa.
The thickness of the Mo resistor is typically 100 nm
and its sheet resistance is 1.5 Ω/□.
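
The deposition rates and the sheet resistance quoted above translate directly into process times and resistor values. The following is a minimal sketch using only the figures given in the text (a sheet resistance of 1.5 ohms per square for the Mo film); the function names and the 10 µm x 5 µm resistor geometry are hypothetical, chosen only to illustrate the arithmetic.

```python
# Deposition rates quoted in the text, in nm/min.
RATES_NM_PER_MIN = {"Nb": 200.0, "Mo": 130.0, "Al": 6.0, "SiO2": 8.0}


def deposition_time_min(material, thickness_nm):
    """Time in minutes needed to deposit a film of the given thickness."""
    return thickness_nm / RATES_NM_PER_MIN[material]


def resistor_ohms(length_um, width_um, sheet_ohms_per_sq=1.5):
    """Resistance of a rectangular film resistor: R = Rs * (L / W)."""
    return sheet_ohms_per_sq * length_um / width_um


# A 100 nm Mo resistor film takes well under a minute to deposit.
print(deposition_time_min("Mo", 100.0) * 60.0, "seconds")
# Hypothetical geometry: two squares in series at 1.5 ohms per square.
print(resistor_ohms(10.0, 5.0), "ohms")
```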


3. MVTL gate family 

Josephson logic circuits are usually com- 
posed of OR and AND gates. They are operated 
in a dual-rail manner because there is no 
INVERTER without a timing signal. An AND 
gate is usually driven by the output signals of 
OR gates because an AND gate by itself cannot 
isolate the output signal from the input signal. 
The MVTL gate family we designed consists 
of OR, AND, and 2/3 MAJORITY gates. The 
TIMED INVERTER is sometimes combined 
with an OR gate and another signal junction. 

The MVTL OR gate has an asymmetric
interferometer and a magnetically coupled
control line. The control current is injected
into the interferometer after magnetic coupling.

Fig. 1—MVTL OR gate. a) Equivalent circuit. b) Photomicrograph.


Figure 1a) shows the equivalent circuit of the
MVTL OR gate. The output current is isolated
from the injected control current using the
single junction J₃ and a resistor. Figure 1b)
is a photograph of the OR gate. This gate has
three Josephson junctions. Two of them, including
J₃, have a diameter of 2.5 µm; the third has a
diameter of 4 µm.
The gate size is 31 x 41 µm². The fastest gate
speed obtained was 2.5 ps for a gate with a
minimum junction diameter of 1.5 µm. The
power consumption was 17 µW/gate. This
gate is faster than any other logic gate, including
semiconductor gates. The relation between the
gate delay and the minimum junction diameter
is shown in Fig. 2. These Josephson logic gates
can attain a gate delay of less than 10 ps/gate
without using sub-micron processing technology.
Figure 2 suggests that gate delay times less
than 1 ps can be achieved if we fabricate a
0.6 µm minimum diameter junction.

Fig. 2—Relation between gate delay and minimum
junction diameter for MVTL OR gate.

The AND gate is simply constructed using
a single junction. It is always driven by the
output signals of the OR gates. This is because
the AND gate has no function to isolate the
output signal from the input signal. Unit cells
are built by combining two OR gates and an AND
gate. Figure 3 shows the equivalent circuit of
the unit cell. The area of the fabricated unit cell
is 82 x 132 µm² when a minimum junction
diameter of 2.5 µm is used. The gate delay
of the unit cell was 16 ps when a minimum
junction diameter of 4 µm was used, and 11.5 ps
for a minimum diameter of 2.5 µm.

Fig. 3—Equivalent circuit of unit cell.

In an LSI logic circuit, the carry signal of
a full adder can be obtained using only one 2/3
MAJORITY gate. The 2/3 MAJORITY gate of
the MVTL gate family is basically the same as
the unit cell except that another OR gate is
connected to one of its junctions. The operating margin
of this gate is the same as that of the unit cell
because the margin is determined by that junction. The
operating speed of the 2/3 MAJORITY gate
was only evaluated for a gate having a minimum
junction diameter of 4 µm. The gate delay was
measured to be 21 ps.

Fig. 5—Block diagram of 1-bit ALU.

The Josephson logic gate requires a timing 
signal for inversion. Therefore, an INVERTER 
gate must be constructed using the timing 
signal. This is called a TIMED INVERTER (TI), 
and is constructed as shown in Fig. 4. The TI 
operates correctly only when the signal current 
is applied before the bias current rises. 
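
The statement that a single 2/3 MAJORITY gate yields the carry of a full adder can be verified exhaustively. The sketch below is a purely logical model written for illustration; it says nothing about the Josephson circuit itself, and the function names are hypothetical.

```python
from itertools import product


def majority_2of3(a, b, c):
    """True when at least two of the three inputs are true."""
    return (a and b) or (b and c) or (a and c)


def full_adder(a, b, carry_in):
    """Return (sum, carry_out) of a one-bit full adder."""
    total = int(a) + int(b) + int(carry_in)
    return total % 2 == 1, total >= 2


# Compare the majority output with the adder carry for all 8 input cases.
carry_matches = all(
    majority_2of3(a, b, c) == full_adder(a, b, c)[1]
    for a, b, c in product([False, True], repeat=3)
)
print(carry_matches)
```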


4. High-speed logic circuits 

Using the above gate family, we fabricated
various logic circuits to test the feasibility of
these gates: a 16-bit Arithmetic Logic
Unit (ALU), an 8-bit shift register, and
a 4-bit microprocessor. The performance of these
circuits is described below. We also fabricated
a single-chip SQUID. The basic idea of the
SQUID is also described below.

Fig. 6—Photomicrograph of 16-bit ALU.

Fig. 7—Circuit diagram of 1-bit shift register.


4.1 16-bit ALU 

We fabricated a 16-bit ALU which performs 
eight arithmetic and four logic functions. 
Figure 5 shows the block diagram of a 1-bit 
ALU. Eighteen unit cells were integrated in 
the block. The unit cell used in this circuit 
is the same as that described in the previous 
section. A multiple bit ALU can be achieved 
by serially connecting multiple blocks of 1-bit 
ALUs. For circuit layout simplicity, no special 
high-speed operation algorithm such as carry- 
look-ahead was adopted in the ALU. Therefore, 
the delay time on the critical path of the ALU 
is the sum of the delays in carry signal propaga- 
tion in the adder mode. 

The 16-bit ALU chip is shown in Fig. 6.
The circuit size is 0.85 x 8.2 mm². There are
900 gates in the ALU, including 36 gates for
measuring the critical path delay.

Fig. 8—Waveforms of three phase power.

The critical path delay was measured to be
0.86 ns. The signal path during this operation
included 83 stages of MVTL OR and AND gates.
After subtracting the propagation delay of
the interconnecting lines in the signal path,
which was calculated to be about 95 ps, the
average gate delay was estimated to be 9.2 ps/
gate. The total power consumption of the chip
was 10.1 mW, or an average of 11.3 µW/gate.
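
The average gate delay follows from the measured figures by subtracting the wiring delay and dividing by the number of gate stages. A minimal sketch of that arithmetic, using only the numbers quoted above (the variable names are illustrative):

```python
critical_path_ps = 860.0   # measured critical path delay (0.86 ns)
wiring_delay_ps = 95.0     # calculated delay of the interconnecting lines
stages = 83                # MVTL OR and AND gate stages on the path

# Delay attributable to the gates alone, averaged over all stages.
gate_delay_ps = (critical_path_ps - wiring_delay_ps) / stages
print(round(gate_delay_ps, 1), "ps/gate")
```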

We have also fabricated a 16-bit multiplier
critical path model. The model includes 828
MVTL gates, which were extracted along the
critical path of the multiplier in order to
estimate the multiplication time. The observed
multiplication time was 1.1 ns.


4.2 8-bit shift register 

We also designed an 8-bit shift register using 
MVTL gates. It has the SHIFT, LOAD, HOLD, 
and CLEAR functions. Figure 7 shows the 
circuit for a 1-bit shift register. S, Z, and A in 
the figure represent the control signals for 
SHIFT, LOAD, and HOLD. DS and DL 
represent the data for SHIFT and LOAD. 

Each 1-bit shift register uses five unit cells,
which are the same size as those in the ALU and
multiplier model, and one TI. They
were supplied with three-phase power: φ₁, φ₂,
and φ₃. The waveform of each phase is sinusoidal
with a DC offset, and the phases are 120 degrees
apart, as shown in Fig. 8. These sinusoidal
waveforms can be replaced by trapezoidal waveforms.
The operating margin is slightly larger for the
trapezoidal waveforms. However, it is easier to
provide sinusoidal waves at high frequencies.
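
The three-phase supply described above can be sketched numerically as three offset sinusoids spaced 120 degrees apart. The amplitude and offset values below are placeholders, not values from the text, and the function name is hypothetical.

```python
import math


def three_phase(t, freq_hz, amplitude=1.0, offset=1.0):
    """Return (phi1, phi2, phi3) at time t.

    Each phase is a sinusoid with a DC offset; phase k lags phase 1
    by k * 120 degrees. Amplitude and offset are arbitrary units.
    """
    w = 2.0 * math.pi * freq_hz
    return tuple(
        offset + amplitude * math.sin(w * t - k * 2.0 * math.pi / 3.0)
        for k in range(3)
    )


# Sample the three phases at a quarter period of a 1 GHz supply.
print(three_phase(0.25e-9, 1.0e9))
```

Because the phases are evenly spaced, their AC parts cancel at every instant, so the sum of the three values always equals three times the DC offset.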

Figure 9 shows a photograph of the
fabricated chip. The circuit area is 1.1 x 2.1 mm²
and it contains 112 gates. The critical current
density was 1700 A/cm², while the design
value was 2100 A/cm². We confirmed that
the 8-bit shift register operated correctly for
all stages of all the control signals using an 80 µs
clock period.

High-speed operation was tested for this
chip. The SHIFT function operated correctly
up to a 2.3 GHz clock frequency. There is
a voltage peak at the clock when the first
stage is switched due to the voltage change
of the chip ground induced by control and
data signals. The total power consumption
was 1.8 mW.

Fig. 9—Chip photograph of 8-bit shift register.

We also developed a pseudo-random bit
sequence generator. The circuit is constructed
using nine stages of the one-bit shift register
described above, with the output signal being
fed back to the 5th stage through an
exclusive-OR gate. Thus it can generate a pseudo-random
bit sequence 511 bits long. We confirmed
that it operated correctly up to 2.2 GHz.
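
The 511-bit sequence length is what a nine-stage maximal-length feedback shift register produces (2⁹ − 1 = 511). The sketch below assumes a Galois-style connection in which the ninth-stage output recirculates to the first stage and is also XORed into the input of the fifth stage. The exact wiring of the fabricated circuit is not given in the text, so this is only one arrangement consistent with the description.

```python
def step(state):
    """Advance a 9-stage shift register by one clock.

    state[0] is the first stage and state[8] the ninth (output) stage.
    Assumed wiring: the output recirculates to stage 1 and is XORed
    into the data entering stage 5.
    """
    out = state[8]
    nxt = [out] + state[:8]   # shift every bit one stage along
    nxt[4] ^= out             # exclusive-OR feedback into the fifth stage
    return nxt


def period(seed):
    """Number of clocks before the register returns to its seed state."""
    state, count = step(seed), 1
    while state != seed:
        state = step(state)
        count += 1
    return count


print(period([1, 0, 0, 0, 0, 0, 0, 0, 0]))
```

This update is multiplication by x modulo x⁹ + x⁴ + 1, a primitive polynomial, so every nonzero seed cycles through all 511 nonzero states.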


4.3 4-bit microprocessor 

We have fabricated a 4-bit microprocessor.
This is the first instance of applying Josephson
devices to a microprocessor, so we wanted to
verify the feasibility of the chip in comparison
with a typical microprocessor constructed
using semiconductor devices. We selected chip
functions that were similar to those of the
AM2901 microprocessor manufactured by
Advanced Micro Devices Inc. This
microprocessor has come to be regarded as the
standard 4-bit microprocessor slice.

Figure 10 shows a block diagram of our
microprocessor. It has a dual memory set which
is used as a 16-word by 4-bit two-port RAM
with a RAM shifter, an 8-function ALU, a Q
register with a Q shifter, and several
controllers. This circuit is driven by three-phase power:
φ₁, φ₂, and φ₃. Dual-rail logic was adopted
in the ALU and controllers of the
microprocessor, and complementary signals are
generated from the input signals using TIs
powered by φ₁. Decoding operations are run
in gates powered by φ₁, reading memory data
by φ₂, and modifying and writing data by
φ₃.

Fig. 10—Block diagram of 4-bit microprocessor.

Fig. 11—Photomicrograph of 4-bit microprocessor.

Both the minimum junction diameter
and line width are 2.5 µm. The interconnecting
lines are 4 µm wide. Figure 11 is a photograph
of the fabricated chip. The basic gate is the
MVTL as mentioned above, and the total
number of gates is 1 841.

All functions and source combinations 
were confirmed at a clock frequency up to 
100 MHz. The clock frequency was limited 
by the maximum clock of the word pattern 
generator. The operation along the critical 
path of the chip was tested using a high-speed 
pulse generator, and confirmed to operate 
correctly up to a clock frequency of 770 MHz. 
The gate power dissipation was 3.6 µW/gate,
and the total power consumed by the chip 
was 5 mW. 

We verified that the Josephson
microprocessor operated at a clock frequency one
order of magnitude higher and consumed three
orders of magnitude less
power than a semiconductor microprocessor.
The performance of AM2901-type microprocessors
in three different device technologies is listed in Table 2.

Table 2. Performance of 4-bit microprocessor

Device                 Si (note 1)   GaAs (note 2)   Josephson
Maximum clock (MHz)                  79              770
Power (W)              1.4           22              0.005

note 1: AMD, 1985 data book
note 2: Vitesse, 1987 GaAs IC Symposium

Fig. 12—Circuit of single-chip SQUID.


4.4 Single-chip SQUID 

A SQUID is a very high-sensitivity magnetic
sensor. We expect it to be applied as an image
sensor in medical and other applications. We
fabricated a single-chip SQUID which contains
all the necessary circuits, such as the pickup coil, SQUID
sensor, and feedback circuit. Conventional
RF and DC SQUID magnetometers use analog
feedback circuits such as lock-in amplifiers
that make integration difficult. We introduced
a digital feedback circuit and a superconducting
storage loop. These made it possible to integrate
the SQUID onto a single chip.

The single-chip SQUID magnetometer
requires only an AC bias, produces a digital
output using no other electronics and operates
at room temperature. The output pulse can be
processed by a digital processor, or can be
output to a display unit through a counter for
direct monitoring of the input magnetic field
waveforms.

Fig. 13—Photograph of single-chip SQUID.

Figure 12 diagrams the circuit. The
figure-eight pickup coil transmits the magnetic flux
to be measured to the SQUID sensor through
two 20-turn integrated coils. The digital
feedback circuit was fabricated using a
superconducting storage loop and an interferometer
as a write gate. The write gate receives a pulse
sequence and writes a positive or negative flux
quantum to the storage loop when a pulse
arrives. Figure 13 shows the chip, which uses
Nb/AlOₓ/Nb junctions having a minimum
diameter of 2.5 µm. The chip area is 3.0 mm x
3.5 mm. This includes the pickup coil, which
is 1.1 mm x 3.2 mm. The pickup coil is coupled
magnetically to the SQUID sensor using two
20-turn winding coils made of a 2.5 µm wide
niobium line.

Magnetic flux coupled to the SQUID sensor
as low as 7 x 10⁻⁵ Φ₀/√Hz was measured,
where Φ₀ is the flux quantum (2.07 x 10⁻¹⁵ Wb).
This corresponds to a magnetic field of
4.7 x 10⁻¹² T/√Hz, and a magnetic field
gradient of 4.5 x 10⁻⁹ T/m/√Hz at the pickup
coil. The sensitivity can be further improved
because we believe the experiment was limited
by environmental noise, not by device noise.
In any case, this device is more sensitive than
magnetocardiograms, which can only measure
several 10⁻¹¹ T.


5. Conclusion 

We have described recent advances in our
Josephson IC technologies. Our technologies
have progressed rapidly since we changed the
junction material from lead alloy to niobium.
We view the niobium junction as a wild horse
that, once tamed, becomes quite gentle and
reliable. We can now control this wild horse.
Although we have not described them in this paper,
Josephson memory circuits are also feasible up
to 4 kbits with a half-nanosecond access time.
Thus, we believe there is no problem in
developing chips containing LSI circuits with tens
of thousands of junctions. The next step is to
set these LSIs in a refrigerator and demonstrate
their high performance.

The present research effort is part of the
National Research and Development Program
on “Scientific Computing Systems”, conducted
under a program set by the Agency of Industrial
Science and Technology, Ministry of
International Trade and Industry.

A STRAM is different from conventional RAMs because it has synchronous operation and
an on-chip write pulse generator. Three types of STRAMs are presented in this paper. Each
type is a standard device and has unique features which are useful in various applications.
A system model using STRAM was evaluated and it was shown that STRAM can improve the
system level cycle speed to twice that of a conventional RAM. Using already established
process technology, Fujitsu has developed a 1K x 4 standard STRAM having a cycle time of
9 ns and a 4K x 4 STRAM having a 13 ns cycle time.


1. Introduction 

With the increasing system speed of high- 
performance data processing equipment, there is 
a corresponding need for high-speed memory 
devices. Improvement of the memory speed has 
mostly been achieved by the introduction of 
new process technology, but this is becoming 
increasingly difficult. Even if a high-speed 
memory device can be developed, it is ques- 
tionable whether the device performance will be 
optimized at the system level. For conventional 
RAMs, timing requirements, including on-board 
signal skew, and the difficulty of generating a 
narrow write pulse under a heavy on-board load 
have made it difficult to improve the system 
level performance as much as the speed of the 
memory devices. 

For these reasons, the “Self-Timed RAM”
(STRAM) has been developed as a synchronous
RAM having a new circuit architecture which
can improve overall system performance. The
STRAM is built using the same process
technology as conventional RAMs.

In this paper, the basic structure of the
STRAM is described in Chap. 2 and the Latch
and Register are defined in Chap. 3. In Chap. 4,
three different STRAM configurations and their
functions are explained based on the
information given in Chaps. 2 and 3. Chapter 5 shows
the advantages of STRAM over conventional
RAM by comparing these two types of RAM
using a system model. The 1K x 4 and 4K x 4
STRAMs that Fujitsu has developed are
introduced in Chap. 6.


2. Basic structure 

The basic block diagram shown in Fig. 1
shows that STRAM differs from a conventional
RAM in the following ways:
1) STRAM has a circuit which temporarily
stores the input and output data

The input buffer gate of each input of the
conventional RAM, namely the Address input (ADD), Data
input (DIN), Chip Select input (CS), and Write
Enable input (WE), is replaced by a data store
circuit in the STRAM. For output, STRAM also
provides a data store circuit in front of the
output buffer gate.
2) STRAM has an on-chip write pulse generator

Due to the internally generated write pulse,
it is no longer necessary to externally control
the write pulse width using the WE input. The WE
input only tells the RAM
whether it is in the read cycle (WE = high level)
or write cycle (WE = low level).
3) STRAM has a clock (CLK) input

The data store circuit and internal write
pulse generator of items 1) and 2) above are
controlled by the clock (CLK) input. STRAM
has synchronous read and write cycles.

Fig. 2—Latch (Level sensitive type latch).

STRAM variations based on the type of data
store circuit are described in the following
sections.


3. Definition of latch and register

The data store circuits shown in Fig. 1 can
be a latch type or register type.

These two types are described below.
1) Latch

The latch defined here is a D-latch type or
“level sensitive” type latch. Figure 2 shows that
the input data (D) is controlled by the level of
the LD input. D is transparent to the output (Q)
when the LD input is low (L). The latch is
closed and D cannot pass through the latch
when the LD input is high (H).

The LD input is controlled by the clock input.
Timing references between the external clock
input (CLK), input data (D), and output (Q) are
shown in Fig. 2. tS is the setup time and tH is
the hold time of input (D) with respect to the
rising edge of the CLK input. These provide the
required conditions for the latch to hold the
assigned input data.

Fig. 3—Register (Edge sensitive type latch).

2) Register 

The register defined here is a pair of latches 
connected in series. One latch is controlled by 
the LD input and the other by an inverted LD 
input. Figure 3 shows the timing references 
between the CLK input, input (D), and output 
(Q). The register is edge sensitive and controlled 
by the edge of the CLK input. Therefore, output 
(Q) remains stable throughout the cycle (tcyc) 
for the register, unlike the data-through mode 
for the latch. 

These latch and register structures are 
advantageous for chip layout because the latch 
or register can be easily built-in using metal 
option technology. Therefore, the different 
STRAMs explained in later chapters can be 
easily manufactured. 
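
The difference between the level-sensitive latch and the edge-sensitive register can be illustrated with a small behavioral model. This is a sketch under simplifying assumptions: the LD input is taken to equal CLK, gate delays and setup and hold times are ignored, and the class and function names are hypothetical.

```python
class Latch:
    """Level-sensitive D latch: transparent while LD is low, closed while high."""

    def __init__(self):
        self.q = 0

    def update(self, d, ld):
        if ld == 0:          # transparent: output follows the input
            self.q = d
        return self.q        # closed: output holds the stored value


class Register:
    """Two latches in series; the second is driven by the inverted LD input."""

    def __init__(self):
        self.first = Latch()
        self.second = Latch()

    def update(self, d, ld):
        stored = self.first.update(d, ld)
        return self.second.update(stored, 1 - ld)


def run(device, samples):
    """Apply a list of (clk, d) samples and collect the output Q."""
    return [device.update(d, clk) for clk, d in samples]


samples = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
latch_q = run(Latch(), samples)
register_q = run(Register(), samples)
print(latch_q)
print(register_q)
```

In this model the latch output follows D whenever CLK is low, while the register output changes only when CLK goes from low to high and then stays fixed for the rest of the cycle.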


FUJITSU Sci. Tech. J., 24, 4, (December 1988) 


4. Types of STRAM 

Various types of STRAMs can be manu- 
factured depending on the type of input and 
output data store circuits. Use of the latch or 
register explained in the previous chapter is 
optional for the input and output data store 
circuits. Figure 4 shows three typical STRAMs 
that are described in this chapter. Figure 5 
shows the timing charts of these STRAMs. 

1) LL-mode 

In the LL-mode STRAM, latches are used 
for both the input and output data store circuits. 
The latches are controlled by the internal clock 
signal. The clock signal to the output latch is 
inverted. Table 1 lists the input and output 
latch functions with respect to the CLK input. 
This table shows that the input latch and the 
output latch operate opposite to one another. 
For example, during the high CLK input state, 
the input latch is closed, the output latch is 
transparent, and data is read out at the output. 
Therefore, any change in the input state does 
not influence the output data. During the low 
CLK input state, the input latch becomes trans- 
parent, and data from the memory cell tries to 
pass through to the output. However, data read 
out does not occur before the next high level 
CLK input because the output latch is closed 
during that period. 

A feature of the LL-mode STRAM access
mode is that output data can appear at the
output independent of the clock edge when the
setup time for address inputs is controlled to a
relatively small value. This through-mode access
(tA(ADD)), shown in Fig. 5, is the same as the
address access time of a conventional RAM.

In the write cycle, write operations must be
completed during the high CLK input state, because
only then is the address data held fixed in the latch, as
explained in Chap. 3. Both read and write
operations of the LL-mode STRAM are performed
during the high CLK input state. Thus, there is a
minimum required time for the CLK input high
duration (tWH(CLK)).

Table 1. Input and output latch operation in LL-mode STRAM

CLK input   Input latch   Output latch
‘L’         Transparent   Closed
‘H’         Closed        Transparent

Fig. 5—Timing charts of three types of STRAM.

Table 2. Summary of features of three types of STRAM

Items         LL-mode                     RR-mode                     RL-mode
Clocking      A minimum high-state time   Duty ratio free.            Duty ratio free.
              of the CLK input is
              required to guarantee a
              complete read or write
              operation.
Read cycle    An address access mode      Data is available in the    Data is available in the
              identical to the            next CLK cycle, but         same CLK cycle, but it
              conventional RAM address    high-speed access from      is an access mode from
              access time (tAA) is        the CLK edge is             the CLK edge.
              possible.                   implemented.
Write cycle   Write operation is          Write operation can be      Write operation can be
              executed only during the    executed throughout the     executed throughout the
              CLK high state.             cycle.                      cycle.

2) RR-mode 

The RR-mode STRAM uses registers for
both the input and output data store circuits.
A feature of the register, as stated in Chap. 3,
is that both holding the input data and reading
the output data are controlled by the CLK
input edge, but the output data corresponding
to specific input data does not become available
on the same CLK edge. In an RR-mode
STRAM, the read-out data is available in the
next cycle, as shown in Fig. 5. High-speed read
operation is enabled because the only delay from
the CLK edge is that of the output register (tPR);
the data does not have to pass through the memory cell array.

In the write cycle, unlike the LL-mode,
there is no minimum required time for tWH(CLK)
to guarantee a complete write operation
because input data remains stable throughout
the cycle.

As stated above, duty ratio free CLK input is
enabled in the RR-mode STRAM because read
and write operations are initiated by the CLK
input edge.
3) RL-mode 

The RL-mode STRAM uses a register for the 
input data store circuit and a latch for the out- 
put data store circuit. Holding the input data 
and the write cycle is the same as for the RR- 
mode. It is different from the RR-mode STRAM 
in that output data is available in the same CLK 
cycle because the output latch is transparent 
during the low CLK input state. This is shown in 
Fig. 5. 

Duty ratio free CLK input is also enabled in 
the RL-mode STRAM as in the RR-mode 
STRAM. 

The main features of these three types of 
STRAMs are summarized in Table 2. 
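
The difference in read latency between the RR-mode and the RL-mode can be illustrated with a cycle-level sketch. This is a simplified model and not the actual circuit: the function name `simulate_read`, the `memory` dictionary, and the convention of recording one output value per CLK cycle are assumptions made for illustration, and analog timing (access time, setup time, hold time) is ignored.

```python
def simulate_read(mode, addresses, memory):
    """Return the data visible at the output at the end of each CLK cycle.

    mode: "RR" (output register) or "RL" (output latch).
    One new address is presented per cycle (hypothetical test pattern).
    """
    if mode not in ("RR", "RL"):
        raise ValueError("mode must be 'RR' or 'RL'")
    addr_reg = None   # contents of the input register
    output = None     # contents of the output register or latch
    seen = []
    for addr in addresses:
        # Rising CLK edge: in RR-mode the output register samples the cell
        # array output, which still corresponds to the previous address.
        if mode == "RR" and addr_reg is not None:
            output = memory[addr_reg]
        # The same edge loads the new address into the input register.
        addr_reg = addr
        # Low CLK state: in RL-mode the output latch is transparent, so the
        # data for the address just loaded reaches the output in this cycle.
        if mode == "RL":
            output = memory[addr_reg]
        seen.append(output)
    return seen


memory = {0: "D0", 1: "D1", 2: "D2"}
print(simulate_read("RR", [0, 1, 2], memory))  # [None, 'D0', 'D1']
print(simulate_read("RL", [0, 1, 2], memory))  # ['D0', 'D1', 'D2']
```

In this sketch the RR-mode output lags the address by one cycle, while the RL-mode output appears in the same cycle, which matches the behavior described for the two modes.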

5. Comparison between conventional RAM 
and STRAM in a system model 

To verify the advantages of STRAM over a 
conventional RAM on the system level, they 
were both applied in the system model shown 
in Fig. 6 and evaluated. Figure 6 shows a system
model in which several RAM arrays are controlled
by the CPU. The CPU driver generates an Address
signal, CS signal, $D_{IN}$ signal, and WE signal,
which are conveyed to each RAM array. These
signals are generated synchronously with the
system clock signal. Read-out data from
each RAM array is returned to the CPU and is 
held in the latch. This chapter compares the 
conventional RAM and STRAM for read cycle 
performance and write cycle performance when 
each device is used as the RAM array in this 
system. The LL-mode STRAM was used in this 
comparison. The same comparison can be 
made using the other two types of STRAMs. 
For simplification, clock skew and skew 
between the system clock and STRAM clock 
are ignored here. Only the essential signals 
required to understand system operation are 
considered. 

1) Read cycle for conventional RAM 

Figure 7 shows the timing diagram when 
conventional RAM is used in the system model. 
Address signals forwarded by the system clock 
run along the signal transmission paths to reach 
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Fig. 7—System read cycle for conventional RAM. 


a RAM array in $t_{PD(min)}$ for the fastest case and
$t_{PD(max)}$ for the slowest case.

This spread in propagation delay is mainly
due to the following factors:

i) Each RAM has a different length signal 
transmission path from the CPU. 

ii) Input capacitance of the RAM varies be- 
tween the maximum and minimum 
values. 

iii) The speeds of the CPU latch and CPU driver
also have maximum and minimum values.

iv) The unit speed of the signal transmission 
path itself has a certain distribution. 

When the address signals are forwarded
by the system clock, RAM output data becomes
valid after the skew between the RAM minimum
access time ($t_{AA(min)}$) corresponding to the
fastest address signal ($t_{PD(min)}$) and the RAM
maximum access time ($t_{AA(max)}$) corresponding
to the slowest address signal ($t_{PD(max)}$). In Fig. 7,
$t_{AA(min)}$ is assumed to be zero for simplification.
After the RAM output becomes valid, it must 
be held at the output for a certain period of 
time so that the CPU can latch the data. This 
time is called $t_{HOLD}$.

The system cycle time ($t_{CYC(SYS)}$) for
conventional RAM is expressed as follows.

$$t_{CYC(SYS)} = t_{PD(skew)} + t_{AA(max)} + t_{HOLD},$$

where $t_{PD(skew)}$ is the skew of signals transmitted
in the system and is given by $t_{PD(max)} - t_{PD(min)}$.

As an example, we applied the following 
assumptions to estimate the actual read and 
write cycle times in our system. 

$t_{PD(skew)}$ = 10 ns:
The transmission skew from the CPU to each
RAM and from each RAM to the CPU is
assumed to have the same value.

$t_{AA(max)}$ = 10 ns:
A RAM having an access time of 10 ns is
assumed.

$t_{HOLD}$ = 13 ns:
A RAM output valid time of 3 ns is assumed
for the CPU to latch the data from each
RAM. Thus, a data hold time of 13 ns
is required for the RAM because of the
previously assumed 10 ns transmission skew
($t_{PD(skew)}$) from the RAM to the CPU.
Based on these assumptions, the system cycle
time is as follows.

$$t_{CYC(SYS)} = 10\ \text{ns} + 10\ \text{ns} + 13\ \text{ns} = 33\ \text{ns}.$$

Although these values partly depend on each
system design, this result implies that a RAM
having an access time of 10 ns is degraded to
a cycle time about three times slower in the
system.
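
The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name `conventional_read_cycle_ns` and the parameter `t_valid` (the 3 ns window in which the CPU latches the data) are names chosen here for illustration.

```python
def conventional_read_cycle_ns(t_pd_skew, t_aa_max, t_valid):
    """System read cycle time of a conventional RAM (all values in ns)."""
    # The RAM must hold its output for the CPU latch window plus the
    # return-path skew from the RAM to the CPU.
    t_hold = t_valid + t_pd_skew
    # Cycle time = address skew + maximum access time + data hold time.
    return t_pd_skew + t_aa_max + t_hold


cycle = conventional_read_cycle_ns(t_pd_skew=10, t_aa_max=10, t_valid=3)
print(cycle)       # 33
print(cycle / 10)  # 3.3, the ratio to the 10 ns device access time
```

Because the skew enters the result twice (once on the address path and once inside the hold time), it accounts for 20 ns of the 33 ns cycle in this example.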
2) Read cycle for LL-mode STRAM 

The system cycle time when a STRAM is 
used in the same system is evaluated below. 
Figure 8 shows the timing diagram. The address 
signals are conveyed to the STRAM with the 
same skew as assumed for the conventional 
RAM. After the STRAM clock edge is inserted
following the required setup time ($t_S$) and the
address data is latched in the STRAM, RAM
output data is read out within the address
access time during the high CLK input state.
When the CLK input goes low after $t_{AA(max)}$
(i.e. after the output data becomes valid), the
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Fig. 8—System read cycle for LL-mode STRAM. 


data remains on the output for the data hold
time ($t_{HOLD}$) because the output latch is closed.
When the CLK input is high, the input latch is
closed. Thus, address signals can be changed to
prepare for the next address after the required
hold time ($t_H$) expires.

Based on the functions described above,
the system cycle time for the STRAM is
expressed as follows.

$$t_{CYC(SYS)} = t_{AA(max)} + t_{HOLD} - t_S,$$

or,

$$t_{CYC(SYS)} = t_S + t_H + t_{PD(skew)}.$$

The equation having the larger value dominates
the cycle time. A smaller value of $t_H$ can
shorten the cycle time, as indicated by the latter
equation. $t_S$ affects the cycle time calculation
in opposite ways for the two equations. If we
assume $t_{H(min)}$ = 2 ns and that $t_{AA(max)}$, $t_{HOLD}$, and
$t_{PD(skew)}$ have the same values as for the
conventional RAM, we can obtain the optimum
$t_S$ which minimizes the cycle time for both
equations. That is,

$t_S$ = 5.5 ns.

Using this value, the system level cycle time
for the LL-mode STRAM is as follows.

$$t_{CYC(SYS)} = 10\ \text{ns} + 13\ \text{ns} - 5.5\ \text{ns} = 17.5\ \text{ns}.$$
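
The optimum setup time can be checked numerically. The sketch below assumes, as the text states, that the cycle time is set by the larger of the two expressions; since one decreases and the other increases with the setup time, the minimum lies where they are equal. The function names are chosen here for illustration.

```python
def stram_read_cycle_ns(t_s, t_aa_max, t_hold, t_h, t_pd_skew):
    """LL-mode STRAM system cycle time: the larger constraint wins (ns)."""
    access_limited = t_aa_max + t_hold - t_s
    address_limited = t_s + t_h + t_pd_skew
    return max(access_limited, address_limited)


def optimum_setup_ns(t_aa_max, t_hold, t_h, t_pd_skew):
    """Setup time at which both constraints give the same cycle time."""
    return (t_aa_max + t_hold - t_h - t_pd_skew) / 2


t_s = optimum_setup_ns(t_aa_max=10, t_hold=13, t_h=2, t_pd_skew=10)
print(t_s)                                       # 5.5
print(stram_read_cycle_ns(t_s, 10, 13, 2, 10))   # 17.5
```

Moving the setup time away from 5.5 ns in either direction makes one of the two constraints larger, so the cycle time grows.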

As described above, using STRAM can
result in a faster system cycle time than
conventional RAM under the same system conditions.
Although these two types of RAMs have similar
system access times ($t_{AA(SYS)}$), as shown in Figs. 7
and 8, STRAM can improve the system cycle
time due to the timing overlap feature that
enables the address to be changed before system
access becomes available.
3) Write cycle for conventional RAM 
Figure 9 shows the write cycle timing diagram 
for conventional RAM when address signals and 
the WE signal are transmitted to each RAM with 
the same amount of signal skew. It is well known
that for write cycle timing in a conventional
RAM, the required conditions for the address
signal setup time ($t_{SA(min)}$) and hold time
($t_{HA(min)}$) with respect to the WE signal and the
minimum pulse width ($t_{WW(min)}$) of the WE
signal must be guaranteed. To meet these
conditions in the system, the following timing
conditions must be met (see Fig. 9):
i) $t_{SA(min)}$ during $t_{PD(min)}$ of the WE signal after
$t_{PD(max)}$ of the address signal.
ii) $t_{HA(min)}$ during $t_{PD(min)}$ of the address
signal after $t_{PD(max)}$ of the WE signal.
iii) $t_{WW(min)}$ between $t_{PD(min)}$ and $t_{PD(max)}$
of the WE signals.
The system cycle that satisfies these conditions
is expressed as follows.

$$t_{CYC(SYS)} = t_{CYC(device)} + 3 \times t_{PD(skew)},$$

where $t_{CYC(device)}$ is the write cycle time of each
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Fig. 11—1K x 4 STRAM chip (4.5 mm x 3.5 mm). 


RAM. It is given by $t_{SA(min)} + t_{WW(min)} +
t_{HA(min)}$. Let us assume that $t_{CYC(device)}$ is
10 ns.

$$t_{CYC(SYS)} = 10\ \text{ns} + 3 \times 10\ \text{ns} = 40\ \text{ns}.$$

This indicates that the system level cycle
time can be as much as four times that of the
device level. As device performance improves,
the share of signal skew in the total system cycle
time becomes larger. Another problem associated
with the write cycle time of high-speed
conventional RAM is the difficulty of generating
a narrow write pulse under a large load
in the system. Even if this is possible, it is
very expensive.

4) Write cycle for LL-mode STRAM 
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Fig. 12-4K x 4 STRAM chip (6.1 mm x 4.5 mm). 


The timing diagram shown in Fig. 10 shows 
that the skew of the address signals and WE 
signal can overlap because the WE signal is 
held at the STRAM CLK input edge in addition 
to the address signals. Immediately after receiving 
the low WE signal, the internal pulse generator 
automatically starts operating to guarantee an 
internal write pulse that satisfies the required 
conditions for the internal setup time and hold 
time with respect to the address signals. These 
operations are implemented during the CLK 
input high state with respect to the internal 
Address and WE timings as shown in Fig. 10. 
The LL-mode STRAM write operation can be
completed within the same CLK input high
state period as the read cycle because a general
characteristic of RAM devices is that $t_{WW(min)}$
is almost the same as $t_{AA(max)}$. This means that
STRAM enables a write cycle time equivalent
to the read cycle time. Using the value previously
obtained for the STRAM read cycle time,
$t_{CYC(SYS)}$ of the STRAM write cycle can be
expressed as follows.

$$t_{CYC(SYS):WRITE} = t_{CYC(SYS):READ} = 17.5\ \text{ns}.$$

As mentioned before, use of STRAM reduces 
the system level write cycle time to less than 
half that of conventional RAM. 
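
The write cycle comparison can be checked in the same way. The sketch below only restates the figures given in the text (a 10 ns device write cycle, a 10 ns skew, and the 17.5 ns STRAM cycle); the function name is chosen here for illustration.

```python
def conventional_write_cycle_ns(t_cyc_device, t_pd_skew):
    """Conventional RAM system write cycle: device cycle plus 3x skew (ns)."""
    return t_cyc_device + 3 * t_pd_skew


conventional = conventional_write_cycle_ns(t_cyc_device=10, t_pd_skew=10)
stram = 17.5  # LL-mode STRAM write cycle, equal to its read cycle (ns)

print(conventional)                    # 40
print(round(stram / conventional, 4))  # 0.4375
```

The ratio of 0.4375 is consistent with the statement that the STRAM write cycle is less than half that of conventional RAM under these assumptions.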


6. Development of 1K x 4 and 4K x 4 STRAM
Figure 11 is a die photo of a 1K x 4 STRAM
and Fig. 12 is a die photo of a 4K x 4 STRAM.
a) 1K x 4 STRAM b) 4K x 4 STRAM


Fig. 13—Pin assignments. 


Based on this die, three variations for the LL-, 
RR-, and RL-modes can be manufactured using 
metal option technology. Both 10K ECL I/O 
and 100K I/O can be supported for each mode. 
Figure 13 shows that the pin configuration is 
common to each mode. Expansion from 1K x
4 to 4K x 4 is possible.

Two CLK input pins can implement high-speed
clocking by using the CLK input and the
$\overline{CLK}$ input simultaneously in the differential
mode. Single-ended mode using the CLK input or
the $\overline{CLK}$ input is also possible by connecting
$\overline{CLK}$ or CLK, respectively, to the internal
reference voltage ($V_{BB}$).

The process technologies used are the
currently mature IOP-II (Isolation by Oxide and
Polysilicon) technology and 1 µm lithography.

The main characteristics for the LL-mode 
STRAM are listed in Table 3. 
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Table 3. Main characteristics of 1K x 4 and 4K x 4 
LL-mode STRAM 


Parameter | Symbol | 1K x 4 STRAM | 4K x 4 STRAM
Cycle time | $t_{CYC}$ | 9 ns min | 13 ns min
Clock pulse width high | $t_{WH(CLK)}$ | 6 ns min | 10 ns min
Address access time | $t_{AA}$ | 7 ns max | 10 ns max
Power supply current | $I_{EE}$ | -380 mA min


7. Conclusion 

This paper introduces STRAM as a RAM 
having a new circuit architecture that can 
provide higher performance in the system than 
conventional RAM. This is achieved by adding 
relatively simple on-chip latch or register circuits 
and a write pulse generator using the same 
process technology. A system model is used to
show that STRAM can more than double the
system cycle speed compared with conventional
RAM. Thus, we can expect that STRAM will be
widely used as a standard device in the future
as a substitute for conventional RAM. STRAM
will gain a reputation as an indispensable
technology, especially for higher-speed RAM devices,
because STRAM can avoid signal skews that are
becoming a major factor in limiting the improve- 


ment of system performance’? 
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This paper discusses the three-dimensional stacked capacitor (3D STC) cell technology that 
Fujitsu used in 1-Mbit DRAMS (Fujitsu was the first to do this), and the development of 
1-Mbit and 4-Mbit DRAMs using the 3D STC technology. 3D STC technology is the key to 
cell area reduction enabling densities higher than 1-Mbit. This technology provides mass 
production capability and a high immunity to alpha-particle-induced soft errors. To respond 
to market demands for low power consumption, high speed, and high reliability, 1-Mbit 
DRAMs were designed using CMOS technology. A 4-Mbit DRAM having an access time of 
56 ns and low power consumption of 175 mW was also developed. 


1. Introduction 

Since the 1-Kbit dynamic memory (DRAM) 
was developed, the density and performance of 
MOS DRAMs have steadily improved and have 
led the semiconductor technologies of Fujitsu. 
In 1985, a three-dimensional stacked capacitor 
cell was developed” and first used in a 1-Mbit 
DRAM. 

The three-dimensional design of a capacitor 
using this cell technology results in a large cell 
capacitance in a very small cell area and high 
scalability. These features have attracted much 
attention to this cell technology and it is being 
widely used for 4-Mbit DRAMs?*). This 
report discusses the three-dimensional stacked 
capacitor cell (3D STC) technology and the 
1-Mbit and 4-Mbit DRAMs that use this tech- 
nology. 


2. Development of DRAMs having capacities 

up to 256 Kbits 

Figure 1 shows the DRAM developments by 
Fujitsu. Since the development of the 1-Kbit 
DRAM in 1971, integration has quadrupled 
every three years. A 4-Mbit DRAM may be 
introduced to the market in 1989. The major
points of this steady progress in high integration
are reviewed below.
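
The stated growth rate can be checked against the dates given. The sketch below assumes an idealized schedule in which capacity quadruples exactly every three years starting from 1 Kbit in 1971; the function name and default parameters are chosen here for illustration.

```python
def projected_capacity_kbit(year, base_year=1971, base_kbit=1,
                            years_per_generation=3):
    """Capacity in Kbit if integration quadruples once per generation."""
    generations = (year - base_year) // years_per_generation
    return base_kbit * 4 ** generations


for year in (1971, 1980, 1989):
    print(year, projected_capacity_kbit(year))
# 1971 1
# 1980 64
# 1989 4096
```

Six generations after 1971 gives 4096 Kbit, that is 4 Mbit in 1989, which agrees with the introduction date suggested in the text.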

Figure 2 shows that the cell area has been 
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reduced by a factor of 400 in the last 18 years. 
This area reduction was mainly due to the 
progress in fine lithography techniques, cell 
structure, and circuit technology. The standard
DRAM design rule has been to scale the
lithography dimensions by a factor of 0.7 in
each generation. The 4-Mbit DRAM must
now be processed at submicron dimensions. In
each generation of DRAM, a cell area reduction
technology has been developed (see Table 1).
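
A rough calculation shows how much of the 400-fold area reduction lithography alone could explain. The sketch assumes that the 18 years correspond to six generations (1 Kbit to 4 Mbit) and that every generation scaled linear dimensions by exactly 0.7; both are simplifications, and the function name is chosen here for illustration.

```python
def litho_area_reduction(generations, linear_scale=0.7):
    """Area reduction factor obtained from linear scaling alone."""
    area_scale_per_generation = linear_scale ** 2
    return 1.0 / area_scale_per_generation ** generations


litho = litho_area_reduction(generations=6)
print(round(litho, 1))        # 72.2
print(round(400 / litho, 1))  # 5.5
```

Under these assumptions lithography accounts for a factor of about 72, leaving a factor of roughly 5.5 to be supplied by the cell structure and circuit improvements listed in Table 1.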

The general progress of DRAMs can be 
viewed as the advance of memory cell tech- 
nology. The memory cell of a 1-Kbit DRAM 
consists of three (or four) transistors (see 
Fig. 3a)”. Although this cell has a large area, 
it has the advantages of a current amplification 
capability and is easy to read. In the single- 
transistor cell system that was first incorporated 
in the 4-Kbit DRAM, a one-bit memory cell 
consists of a transistor as a switch and a capaci- 
tor that stores information as an electric charge 
(see Fig. 3b). 

Because the single-transistor cell does not 
have current amplification capability, the 
signal voltage on a bit line is very small (100 mV 
to 200 mV). Advances in circuit technology,
including the sense amplifier, have enabled this
small signal to be detected at high speeds and
have provided the basis for DRAMs ever since.
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Table 1. Development of DRAM technology for each generation. 


Generation | Design rule (µm) | Etching | Cell structure | Circuit technology
1K | 9 | Wet | 3 transistors | pMOS dynamic
4K | 8 | Wet | 1 transistor | nMOS dynamic
16K | 6 | Wet | Double poly-Si | MPX add. 16 pin PKG
64K | S | Dry | Double poly-Si | Single 5 V supply
256K | 2.3 | Dry | Triple poly-Si | Redundancy, high speed
1M | 1.8 | Dry | 3D STC cell | CMOS dynamic
4M | | Dry | 4 layer poly-Si STC | Blocknized peripheral

Fig. 1—DRAM development of Fujitsu (axes: capacity in bit/chip versus year, 1970 to 1990).


The following section explains the principle of 
operation of the sense amplifier for the single- 
transistor cell using the circuit of the 256-Kbit 
DRAM MB81256” as an example. 


2.1 Sense amplifier for single-transistor cell 
Figures 4 and 5 show the major circuits of 
the sense amplifier and their operating wave- 
forms. A sense amplifier consists of a dynamic 
flip-flop circuit having a pair of transistors (Q1
and Q2). A small differential voltage between
the left and right bit lines is quickly amplified. 
That is, sense amplifier activation clocks A and 
B are set to a high level sequentially at time f, 
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Fig. 2—History of cell area reduction (Cell area reduced 
by a factor of 400 in 18 years). 


(see Fig. 5), and the differential voltage between 
nodes N, and Ny, increases. At time f, after 
amplification, the active restore circuit operates 
to recharge the bit line to the high level V,,, and 
the series of read operations is completed. 

The cell read signal voltage can significantly 
affect the stability of sense amplifier operation. 
This is analyzed in a simplified manner below. 
In Fig. 4, the bit line capacitance is Cpy, cell 
capacitance is Cs, dummy cell capacitance is 
Cp, and the cell potential at time to is V;. The 
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Fig. 3—DRAM memory cell circuit. 
Fig. 4—Sense amplifier circuitry of 256-K DRAM.


potential difference (or signal voltage) Vig 
between sense amplifier input nodes N, and N, 
at time f, is determined by the ratio of cell and 
bit line capacitances and is expressed as follows. 


Vsig ee = Wah FF 
a 
Ga ths co’ Y= /n- 


In the above expression, y is the cell read 
efficiency and V, is the noise voltage to the 
sense amplifier. From the above expression and 
the MB81256 cell capacity, the relation between 
cell potential V, and signal voltage Vij, is 
obtained as shown in Fig. 6. In this example, 
the reference voltage is set to 2 V and the read 
margin at the high level is set above the read 
margin at the low level because of the leakage 
at the p-n junction and the cell charge loss due 
to alpha particles”. 


2.2 DRAMs and cell technology before 1-Mbit 


DRAM 
The 64-Kbit DRAM can be operated by a 
Fig. 5—Operational waveforms of sense amplifier (axes: voltage (V) versus time).

Fig. 6—Relation between $V_{cell}$ and $V_{sig}$, with the sense amplifier margin indicated (Dummy cell
is adjusted to fit $V_{ref}$ = 2 V).


single 5 V power supply, while conventional
DRAMs require three power supplies (+12 V
and ±5 V). This design enabled the DRAM to be
connected to peripheral circuitry more easily 
and expanded the field of DRAM applications 
from main storage in large computers to person- 
al computers. This resulted in a dramatic increase 
in the demand for DRAMs. nMOS technology, 
which had been used for DRAMs having ca- 
pacities of 4 Kbits to 256 Kbits, was replaced 
by CMOS technology when 1-Mbit DRAMs 
were developed to reduce power consumption 
and increase speed. 

The objective of cell structure development 
at this time was the promotion of multilayer and 




T. Nakano, and T. Yabu: 3D Stacked Capacitor Cell for... 


three-dimensional designs. For DRAMs having 
4 Kbits or less, the capacitor and transistor 
consist of a single poly-silicon layer whose 
capacitor area occupies only a small portion of 
the cell area. Double poly-silicon layer tech- 
nology was first used for a 16-Kbit DRAM. 
The double poly-silicon layer structure uses
the first layer for the capacitor electrode and
the second layer for the transistor gates. This
increases the capacitor area occupancy ratio.
In addition, the function of each poly-silicon
layer can be limited, enabling the optimum
gate oxide thickness for the capacitor and
transistor to be selected individually. Thus, the
maximum capacitance can be provided in a very
small cell area.

The multilayer poly-silicon design was 
further advanced. The resulting three-layer 
poly-silicon technology developed for the 
256-Kbit DRAM has increased the speed even 
more. This technology provided the basis for 
the smooth development of 1-Mbit
three-dimensional stacked capacitor (3D STC) cells.
The cells of DRAMs having 256 Kbits or less
use the surface of the silicon substrate for the
capacitor and transistor and are classified as
planar cells. If the cell area were reduced by
using only fine lithography together with the
planar technique to obtain capacities of 1 Mbit
to 4 Mbits, the cell capacitance required to
guarantee immunity from alpha-particle-induced soft
errors could not be achieved. To solve this
problem, the three-dimensional design was
employed based on the concept of a stacked
capacitor cell, which overlays the capacitor on
the transistor for efficient use of the silicon
surface.


3. Stacked capacitor cell technology 
3.1 Features of mega bit DRAM cells 

Various cell structures have been proposed
for mega bit DRAM memory cells having
three-dimensional structures. The stacked cell forms
a capacitor on top of the access transistor of a
single-transistor cell. The trench cell forms a capacitor
in a trench dug in the silicon substrate. Many of 
the suggested cell structures were trench cell 
types, but planar cells, which were mainly used 
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for DRAMs having capacities of 256 Kbits or 
less, are generally used for the 1-Mbit DRAMs 
now in mass production. 

However, the planar cell is reaching the 
limit of its capability. When the future capacity 
of DRAMs is considered, it is now necessary to 
select other cell types as the memory cells for 
4-Mbit to 16-Mbit DRAMs. Recently, the 
developments in stacked capacitor cells have 
gained attention for their applicability to fine 
lithography and high scalability. 

Fujitsu led other manufacturers by develop- 
ing the three-dimensional stacked capacitor 
cell and using it for 1-Mbit DRAMs. Fujitsu 
subsequently developed a memory cell for a
4-Mbit DRAM having the smallest cell area
reported so far using fine lithography
technology. Furthermore, Fujitsu has promoted
the development of stacked capacitor cell
technology combined with a dielectrically
encapsulated trench (DIET) capacitor cell” 
which combines the advantages of trench cells 
and stacked cells. 


3.2 Three-dimensional stacked capacitor cell 

The memory cell structure is the most important
element in the design of a DRAM. The memory cell
largely determines the performance and mass
producibility of the DRAM. The memory
cell size of the 1-Mbit DRAM must be reduced 
to about one-third that of the 256-Kbit DRAM. 
The memory cell size of the 4-Mbit DRAM 
must be reduced to about one-third that of 
the 1-Mbit DRAM by using a scaling factor 
of 0.6 to 0.7. When Fujitsu developed the 
1-Mbit DRAM, it planned to develop a basic 
structure for the memory cell that could be 
used for at least two generations. After investi- 
gating various memory cell structures such as the 
stacked cell, trench cell, and planar cell, Fujitsu 
selected the stacked capacitor cell structure due 
to its high scalability and thus the expandability 
to 4-Mbit DRAMs. 

3.2.1 Folded bit line configuration 

The basic idea of a stacked cell has existed
since 1978⁸⁾. This idea, however, simply
stacks the capacitor on the access transistor of 
the cell and has an open bit line configuration. 
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Fig. 7—3D stacked capacitor cell. 


Fujitsu has also used the open bit line
configuration for its 64-Kbit and 256-Kbit
DRAMs. However, mega bit DRAMs which
have reduced memory cell size require a cell 
structure that enables the folded bit line 
configuration for an improved noise margin. 
Fujitsu has improved the conventional stacked 
capacitor cell structure by locating word lines 
under the second polysilicon layer that forms 
the charge storage electrode (see Fig. 7). This 
structure forms a memory cell at every other 
intersection of a bit line and a word line and 
enables the folded bit line configuration. 

3.2.2 Cell size 

A planar cell forms a flat capacitor on the 
surface of a substrate. The capacitor area is 
reduced in proportion to the reduction of 
memory cell size. Even when the fine lithog- 
raphy technique is fully implemented, it has a 
limited ability to provide a large capacitance 
in a small area. Because a trench cell has a 
trench in the substrate in which the capacitor is 
formed, the capacitor is also formed on the 
surface of the side walls within the trench. 
If the trench is deep, a relatively large memory 
cell capacitance can easily be provided. 

A stacked cell forms the capacitor on the 
access transistor. Therefore, the memory cell 
capacitance can be increased because the 
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Fig. 8—Comparison of cell size for three types of cell 
structure. 


capacitor is formed on the top and sides of the
polysilicon layer for the storage node. The bent
shape of the storage node also contributes to the
capacitance increase.

The memory cell areas of the stacked cell, 
trench cell, and planar cell were compared when 
the same lithography technique was used and 
the capacitor areas were the same (see Fig. 8). 
The results show that the three-dimensional 
stacked capacitor cell is the best for reducing the 
memory cell size. 

3.2.3 Soft error immunity 

A soft error is an event in which cell infor- 
mation is destroyed. The charge generated by an 
alpha particle beneath the charge storage region 
of the cell is absorbed in the diffusion layer, 
and the voltage potential of the cell is lowered 
causing cell information to be destroyed. 

The first method to prevent soft errors is 
to suppress the generation of alpha particles by 
increasing the purity of the package material 
or by preventing alpha particles from entering 
the silicon substrate. The second method is to 
increase the charge storage capacitance to reduce 
the adverse effect of the charges generated by 
the alpha particles. The third method is to 
design a memory cell structure having high 
resistance to soft errors by lowering the charge 
collection efficiency of the diffusion layer. 
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Fig. 9—Comparison of SER for three types of cell 
structure. 
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Fig. 10—Dependence of collected charge on junction 
area. 


To increase the storage capacitance, the 
thickness of the capacitor film must be reduced 
and the storage electrode areas must be in- 
creased. To lower the charge collection efficien- 
cy, the diffusion layer area must be reduced or a 
potential barrier must be formed. For example, 
a HiC structure!) or memory cell formation in 
a p-well!” is required. 

When the stacked capacitor cell was de- 
veloped, three test devices having the stacked 
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capacitor cell, planar cell, and trench cell struc- 
ture were made and actual soft error rates were 
measured (see Fig. 9). The results show that the 
stacked cell causes fewer soft errors even though 
it has a small memory cell capacitance. 

Because the capacitor is formed on poly- 
silicon in the stacked cell structure, its diffusion 
area is very small and the collected charge 
amount is reduced. On the other hand, the 
capacitor area in a planar or trench cell is 
equivalent to the diffusion layer area and the 
edge of the drain is added to this area. The 
diffusion layer in the charge storage region
is therefore enlarged and the collected charge
amount becomes large. Although a large memory
cell capacitance can theoretically be maintained by
the trench cell structure, the critical charge
amount must be increased because the diffusion
layer area increases as the capacitance
increases. For this reason, an additional
countermeasure, including a potential barrier 
on the side walls of the capacitor, is required. 

This characteristic can also be illustrated 
by the results of an experiment in which col- 
lected charge amounts are measured using 
test devices having various junction areas (see 
Fig. 10). If the junction area is small in com- 
parison to the charge amount to be collected, 
the collected charge amount is reduced be- 
cause the effective funneling length is shortened 
by the electric field distortion at the junction 
edges, and because adjacent cells partially 
absorb the charge. 

3.2.4 Charge retention characteristic 

The charge retention characteristic of the 
memory cell is important in relation to the 
refresh time of the DRAM. In a 256-Kbit 
DRAM, the refresh time is 256 cycles/4 ms. 
That is, 1024 memory cells are refreshed by 
one refresh operation and the operation must be 
executed 256 times in 4 ms. In a 1-Mbit DRAM,
the refresh time is 512 cycles/8 ms, and the 
refresh time of a 4-Mbit DRAM is 16 ms if the 
refresh overhead time is the same as that of the 
1-Mbit DRAM. The refresh time doubles for 
each DRAM generation. 
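
The refresh figures quoted above can be cross-checked with a short calculation. The sketch assumes that every refresh operation covers an equal share of the array; the function name `refresh_summary` is chosen here for illustration.

```python
def refresh_summary(bits, cycles, period_ms):
    """Return (cells refreshed per operation, interval between operations in µs)."""
    cells_per_refresh = bits // cycles
    interval_us = period_ms * 1000 / cycles
    return cells_per_refresh, interval_us


print(refresh_summary(256 * 1024, cycles=256, period_ms=4))   # (1024, 15.625)
print(refresh_summary(1024 * 1024, cycles=512, period_ms=8))  # (2048, 15.625)

# Keeping the same 15.625 µs interval with a 16 ms period implies:
print(int(16 * 1000 / 15.625))  # 1024 refresh cycles
```

Both generations issue a refresh operation every 15.625 µs, which is what keeping the refresh overhead constant means in this model: the refresh period can double only because the cell retains its charge twice as long.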

To prolong the charge storage time, all 
sources of leakage current must be reduced. 
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Fig. 11—Threshold voltage as a function of gate length. 


When compared with other cell structures, 
the stacked capacitor cell has a small p-n junc- 
tion area of the capacitor and lower leakage 
current. The stacked capacitor cell can also 
incorporate conventional isolation techniques, 
resulting in sufficient isolation width and a 
lower leakage current. 

The leakage current of the capacitor dielectric
film on the poly-silicon is not more
than $10^{-16}$ A per cell when the film thickness
is 5 nm (effective oxide thickness) and the
electric field in the insulating film is 5 MV/cm.

When the transistor becomes very small,
characteristic degradation due to hot carriers
and short- and narrow-channel effects becomes
a problem. The stacked cell can use a large
access transistor in comparison with the planar 
and trench cells. Alternatively, if the same 
size transistor is used, the stacked cell can 
have a smaller memory cell size than that 
of other memory cell structures. Figure 11 
shows that the subthreshold swing even in the 
submicron gate length is 80 mV/decade and 
the leakage current can be suppressed enough 
to eliminate the adverse effect on the charge 
retention characteristic. 


3.3 Development of 4-Mbit DRAM memory cell 
A 4-Mbit DRAM memory cell was developed 
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by further scaling the three-dimensional stacked 
capacitor cell developed for 1-Mbit DRAMs. The 
basic memory cell structure is common to 
1-Mbit and 4-Mbit DRAMs. The 1-Mbit DRAM
incorporates a three-layer poly-silicon and
one-layer Al process technology, in which polycide
is used for word lines and Al wiring is used
for bit lines. The 4-Mbit DRAM uses a more
advanced technology having a four-layer
poly-silicon and one-layer Al process.

In the 4-Mbit DRAM, contacts with the Al 
word lines are made at eight positions in the cell 
array to minimize the delay time due to the 
polycide word lines on the first poly-silicon 
layer. The bit line is formed by the fourth layer of
polycide, on which it is easy to form a fine bit
line pitch. This eliminates stray capacitances
that would occur if thick Al bit lines were
formed. This design resulted in a ratio of bit line
capacitance to cell capacitance ($C_B/C_S$) of about
eleven, which is sufficient for signal sensing.
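
To see why a ratio of about eleven is workable, the read signal can be estimated with a simple charge-sharing model. This is a textbook approximation and not the expression used in this paper: it neglects the dummy cell and the noise voltage, and the 2 V difference between the stored cell level and the bit line level is a hypothetical value chosen only for illustration.

```python
def read_efficiency(cb_over_cs):
    """Fraction of the cell voltage difference that appears on the bit line.

    Simple charge sharing between cell and bit line: C_S / (C_S + C_B).
    """
    return 1.0 / (1.0 + cb_over_cs)


def signal_voltage_mv(delta_v, cb_over_cs):
    """Bit line signal in mV for a cell-to-bit-line difference delta_v (V)."""
    return 1000 * delta_v * read_efficiency(cb_over_cs)


print(round(read_efficiency(11), 4))      # 0.0833
print(round(signal_voltage_mv(2.0, 11)))  # 167
```

With these assumed values the signal is on the order of the 100 mV to 200 mV range mentioned earlier for single-transistor cells.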

In addition, a cell capacitance of 27 fF was
realized by the development of a capacitor
insulating film having a thickness of 10 nm
(effective oxide thickness) or less and by virtue
of the three-dimensional structure of the stacked
cell. Thus, a 7.5 µm² cell was developed and put
into use in a practical device.
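
The benefit of the three-dimensional structure can be illustrated with a parallel-plate estimate of the electrode area needed for 27 fF. The sketch assumes an effective oxide thickness of exactly 10 nm and the relative permittivity of silicon dioxide (3.9); edge effects are ignored, and the function name is chosen here for illustration.

```python
EPSILON_0 = 8.854e-12  # vacuum permittivity, F/m
EPSILON_R_SIO2 = 3.9   # relative permittivity of silicon dioxide


def required_area_um2(capacitance_f, eff_oxide_thickness_m):
    """Parallel-plate electrode area in µm² for a given capacitance."""
    area_m2 = capacitance_f * eff_oxide_thickness_m / (EPSILON_0 * EPSILON_R_SIO2)
    return area_m2 * 1e12  # convert m² to µm²


area = required_area_um2(27e-15, 10e-9)
print(round(area, 2))        # 7.82
print(round(area / 7.5, 2))  # 1.04
```

Under these assumptions the capacitor electrode area would have to be slightly larger than the entire 7.5 µm² cell, which a planar layout cannot provide; using the top and side surfaces of the storage node makes this possible.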

3.3.1 Four-layer poly-silicon process 

Figure 12 shows the process to make the 
4-Mbit DRAM memory cell. The substrate is 
p-type silicon. After isolation and formation of 
the n-well of CMOS for peripheral circuitry 
using conventional methods, gate electrodes, 
including those for the access transistor, are 
formed on the first poly-silicon layer (polycide). 
Then the source and drain areas are formed by ion
implantation (Fig. 12a). After an oxide film
is grown by the CVD method and the contact
holes are formed by imprinting a mask pattern,
the second poly-silicon layer for the storage
nodes is grown. To process the second
poly-silicon layer, which affects the storage capacitance,
accurate patterning was performed while
avoiding the influence of the first poly-silicon
layer (Fig. 12b). For this process, new
lithography and etching techniques to delineate
an exact pattern of the reticle and a new
Fig. 12—Schematic view of 3D stacked capacitor cell
and fabrication process.


technique for producing accurate and defect-free
reticles were developed.

After the capacitor dielectric film is formed, 
the third poly-silicon layer for the cell plate is 
grown. After the oxide films between layers are 
grown, the bit line contact holes are opened 
and bit lines are formed by the fourth poly-silicon 
layer (polycide) (Fig. 12c). Then, aluminum 
word lines are formed by the conventional 
method (Fig. 12d). Figure 13 shows a 
cross-sectional SEM view. 

Fig. 13—SEM cross-sectional view of 3D stacked 
capacitor cell. 

Fig. 14—Leakage current of capacitor film. 


3.3.2 Capacitor dielectric film 

A key technique of the process for the 
stacked capacitor cell is the formation of the 
capacitor dielectric film on the poly-silicon. 
The 4-Mbit DRAM requires a film thickness not 
more than 10 nm (effective oxide thickness). 
There is also a physical limit for the silicon 
oxide film thickness. When the thickness becomes 
5 nm or less, the conduction mechanism 
of the film changes and its dielectric characteristics 
deteriorate rapidly. Therefore, the film 
cannot be made thinner than this limit. 

Fig. 15—Stress current on lifetime (0.1% cumulative 
failure). 

To determine the minimum film 
thickness, I-V characteristics were measured as 
shown in Fig. 14 using a film 5 nm thick (effective 
oxide thickness), which is close to the 
physical limit, and a test pattern 
having a capacitor area of 40 mm² and 
the same surface steps as a 4-Mbit DRAM. 
This measurement confirmed that the leakage 
current per cell under the device operating 
conditions is not more than 10⁻¹⁶ A. 
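
To put a leakage current of 10⁻¹⁶ A per cell in perspective, the sketch below estimates how long such a current would take to remove a given fraction of the stored charge. It is a simple illustration under stated assumptions: only dielectric leakage is considered (junction and transistor leakage are ignored), the cell holds 27 fF with 2.5 V across the film, and a 10 percent charge loss is taken as the tolerance. None of these tolerances come from the measurement itself.

```python
def leakage_droop_time_s(cap_farads, stored_volts, leak_amps, allowed_loss=0.1):
    """Time for a constant leakage current to remove a fraction of the charge."""
    stored_charge = cap_farads * stored_volts          # Q = C * V
    return allowed_loss * stored_charge / leak_amps    # t = dQ / I


t = leakage_droop_time_s(27e-15, 2.5, 1e-16)
print(f"time to lose 10% of the charge: about {t:.0f} s")
```

The result is on the order of a minute, far longer than a typical DRAM refresh interval, so dielectric leakage at this level does not limit charge retention.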

In addition, the time dependent dielectric 
breakdown (TDDB) of the capacitor film was 
estimated by an accelerated test using constant- 
current stress (see Fig. 15). The operating 
life of the capacitor film calculated using the 
current acceleration factor obtained from the 
test result was essentially infinite even for a 
film thickness of 5 nm. 

The results of these measurements showed 
that the capacitor dielectric film on the poly-silicon 
has sufficient charge retention characteristics 
and operating life even when its thickness 
is close to the physical limit. Consequently, a 
capacitor film thickness of 7 nm to 8 nm 
was selected to allow for fluctuations in the 
production process. 

Fig. 16—Master slice/wire bond option control circuitry. 


4. Development of mega bit CMOS DRAM 
4.1 Eight types of 1-Mbit DRAM on the same 
chip 

This section explains the circuits and 
features of Fujitsu’s CMOS DRAMs that use the 
three-dimensional stacked capacitor cell and 
CMOS peripheral circuits described in the 
preceding chapter. 

The MB81C1000/1/2/3 series, with a 1-Mbit x 
1-bit organization, and the MB81C4256/7/8/9 
series, with a 256-Kbit x 4-bit organization, form 
eight different types of 1-Mbit DRAM fabricated 
on the same bulk chip. The type of DRAM 
product is selected by means of the aluminum 
master-slice and wire bonding in the assembly 
step. Figure 16 shows the control circuit for 
these DRAMs. When the option pads (FP in Fig. 16) 
are pulled up to Vcc or down to Vss, the FAST PAGE 
(FPG), NIBBLE (NB), STATIC COLUMN (SC), 
or SERIAL ACCESS (SA) mode can be selected. 

Fig. 17—Address counter block diagram for nibble mode and serial access mode. 

Fig. 18—Comparison of row decoder between nMOS and 
CMOS. 

Figure 17 shows the address counter connections 
that provide the SA mode. In the 
256-Kbit x 4-bit organization, continuous 
2-Kbit data can be accessed at high speed by 
operating all counter bits A0C to A8C. If the 
two low-order bits of this counter are used, the 
NB mode can be provided. In the 1-Mbit x 1-bit 
organization, the counter operation is the 
same as in the 256-Kbit x 4-bit organization 
except that the address boundary is A0C to A9C. 
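
The relationship between the two modes can be sketched as a counter whose counting field is either the full column address or only its two low-order bits. This is a minimal illustration, not the circuit of Fig. 17: it assumes a 9-bit column counter for the x 4 organization (2 Kbits / 4 bits = 512 columns) and assumes that the counter wraps around within its field. The function name is hypothetical.

```python
def next_column(addr, mode, counter_bits=9):
    """Advance a column address in serial access (SA) or nibble (NB) mode."""
    if mode == "SA":
        span = 1 << counter_bits   # all counter bits count
    elif mode == "NB":
        span = 4                   # only the two low-order bits count
    else:
        raise ValueError(f"unknown mode: {mode}")
    base = addr & ~(span - 1)      # bits above the counting field stay fixed
    return base | ((addr + 1) & (span - 1))


# Nibble mode cycles through four addresses; SA mode walks the whole row.
print([next_column(a, "NB") for a in (4, 5, 6, 7)])
print(next_column(5, "SA"), next_column(511, "SA"))
```

For the 1-Mbit x 1-bit organization the same sketch applies with `counter_bits=10`.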

4.1.1 Power consumption 

The MB81C1000 series uses a p-type substrate 
and n-well CMOS technology to provide 
low power consumption and high-speed operation 
at the same time. Figure 18a) shows the 
circuit of the row decoders used in the MB81256, 
which is based on conventional nMOS technology. 
During address decoding in the MB81256, all 
decoders except the selected one repeat 
charging and discharging at every memory cycle. 
In the MB81C1000 shown in Fig. 18b), only the 
one selected decoder repeats charging and 
discharging. All other decoders remain in the standby 
state (decoder node held at H). Because of this feature, 
the gate capacitances of the large transistors (Q in Fig. 18b) 
are not charged and discharged every cycle, 
and unnecessary power consumption is avoided. 
Furthermore, ground noise and substrate noise 
caused by discharging can be eliminated, resulting 
in stable operation of the sense amplifier. 

Because the reset level of the bit lines is set 
to about 1/2 Vcc, the charging and discharging 
current (which significantly affects the power 
consumption of the DRAM) is reduced to about 
35 percent of that of the conventional Vcc resetting 
method. In addition, each bit line is divided 
into four sections by the shared sense amplifiers 
located on both sides of the column decoder 
in the middle. This eliminates the charging 
and discharging of bit lines that are 
not being read or written. This design 
also reduces power consumption effectively. 

4.1.2 Reliability 

Setting the bit line reset voltage to 1/2 Vcc 
results not only in low power consumption but 
also in improved reliability. 

Fig. 19—Photomicrograph of 4M DRAM (chip size: 4.92 x 13.22 mm²). 


First, by making the potential of 
the bit line reset level equal to that of the 
capacitor plate of the cell, the electric field across 
the capacitor film is reduced by half. This 
allows the capacitor film to be made much thinner, 
reduces alpha-particle-induced soft errors, and 
improves the time dependent dielectric breakdown 
(TDDB) characteristics of the capacitor film itself. 

Second, the potential of the capacitor 
plate of the cell and the bit line reset voltage 
are set to follow fluctuations in the power 
supply (Vcc). This stabilizes the read signal 
voltage to the sense amplifiers regardless of 
fluctuations in Vcc, and makes the device 
highly resistant to Vcc noise (Vcc bump). 

Third, boost circuits, including the word 
driver, can be eliminated and fully static circuits 
are used for all internal circuits. As a result, 
the memory not only has the advantage 
of CMOS circuits, which are highly tolerant 
of small leakage currents, but also avoids 
characteristic degradation, including that due to 
hot carriers. 

The fourth advantage of the 1/2 Vcc reset 
system is the reduction in the peak current of 
the Vcc power supply. Excessive peak current 
causes noise, which adversely affects memory 
device operation. This has frequently caused 
trouble on users' boards. 

In the MB81C1000 series, the peak current 
is lowered to 100 mA or less by using the 
1/2 Vcc reset system as well as other techniques, 
making it possible to produce a device that is easy to 
use. 
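
The first advantage listed above, the halving of the electric field across the capacitor film, can be illustrated numerically. The sketch below is a simplified estimate: it assumes the storage node swings between 0 V and Vcc, a 5 V supply, and a 7.5 nm film (the middle of the 7 nm to 8 nm range selected earlier), and it treats the field as simply voltage divided by thickness. The function name is hypothetical.

```python
def worst_case_field_mv_per_cm(vcc, plate_volts, film_nm):
    """Largest field across the capacitor film for a node at 0 V or vcc."""
    worst_volts = max(abs(vcc - plate_volts), abs(0.0 - plate_volts))
    return 10.0 * worst_volts / film_nm   # 1 V/nm equals 10 MV/cm


fixed_plate = worst_case_field_mv_per_cm(5.0, 0.0, 7.5)   # plate tied to ground
half_vcc = worst_case_field_mv_per_cm(5.0, 2.5, 7.5)      # plate at 1/2 Vcc
print(f"{fixed_plate:.1f} MV/cm versus {half_vcc:.1f} MV/cm")
```

With the plate at 1/2 Vcc the film never sees more than half the supply voltage, whichever data value is stored.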


4.2 Development of 4-Mbit DRAM MB814100/814400 series 

A 4-Mbit DRAM that can be mounted in a 
300 mil dual in-line package (DIP) has been 
developed through the incorporation of a 
memory cell having the three-dimensional 
stacked capacitor structure using four layers of 
poly-silicon and the scaling of CMOS devices. 
Figure 19 shows a photograph of the chip. 

The major technical issue when mounting a 
4-Mbit DRAM in a 300 mil DIP is how to secure 
the cell area under the restrictions imposed by 
the package while maintaining cell capacitance 
and immunity to alpha-particle-induced soft 
errors. Considering the immunity of the stacked 
capacitor cell to alpha-particle-induced soft 
errors, Fujitsu has set the cell area at 
7.5 µm²; this is the minimum reported cell 
area for a 4-Mbit DRAM. This cell area was 
selected because it enables a chip area of less 
than 70 mm² and because the memory device 
can be mounted in a conventional package. 
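
The link between cell area and chip area can be checked with simple arithmetic. The sketch below is an illustration only: it assumes exactly 4 x 2^20 cells and ignores redundancy and dummy cells, and the names are hypothetical.

```python
CELLS_4M = 4 * 2**20   # number of memory cells, ignoring any redundancy


def array_area_mm2(cell_um2, cells=CELLS_4M):
    """Total silicon area taken by the memory cells alone, in mm^2."""
    return cell_um2 * cells * 1e-6


cells_area = array_area_mm2(7.5)
print(f"cells alone: {cells_area:.1f} mm^2, "
      f"{100 * cells_area / 70:.0f}% of a 70 mm^2 chip")
```

Under these assumptions the cells occupy a little under half of the 70 mm² budget, leaving the rest for sense amplifiers, decoders, and peripheral circuits.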

The development of the 4-Mbit DRAM 
MB814100/814400 series had three objectives: 
1) Electrical characteristics, including the 
   alpha-particle-induced soft error rate, must be at 
   least equivalent to those of existing DRAMs. 
2) The 4-Mbit DRAM must be compatible with 
   various packages and capable of being 
   produced in various types. 
3) High-quality and inexpensive memory must 
   be supplied to users by using stacked capacitor 
   cells, which have already been mass-produced. 

Fig. 20—Block diagram of 4M DRAM (address clocks are shared across the whole chip, but each 
1-Mbit block has its own clock generator). 

4.2.1 Design concept of 4-Mbit DRAM 

The chip area of the newly developed 4-Mbit 
DRAM is small (65 mm²). It can be mounted 
not only in the 300 mil DIP but also in various 
packages such as the Small Outline J-leaded 
package (SOJ) and Zigzag In-line Package (ZIP) 
that have the same size as those of a 1-Mbit DRAM. 
The circuit design followed that of the 1-Mbit 
DRAM as much as possible but with improved 
power consumption and operating speed. 
The design improvements are the 1/2 Vcc 
reset system for the cell plate and bit lines, 
determination of the word x bit organization 
by wire bonding, and positioning of part of the 
peripheral circuits in the middle of the chip. 

For large-capacity DRAMs of 1 Mbit or 
more, the division of the memory cell array is 
very important for determining the overall 
characteristics of the DRAM. This is because the 
length of aluminum wiring in a chip increases 
to 10 mm to 20 mm and the delay time in 
wiring becomes an important factor in the 
DRAM speed. In many cases, the power supply 
and ground lines may receive the noise generated 
when all decoders and sense amplifiers in the 
array are operated at the same time. This 
restricts the margin of device operation. In 
addition, electromigration must be considered 
when determining the power line width. 

4.2.2 1-Mbit blocking organization 

Because the sense amplifier pitch in the 
4-Mbit DRAM can be reduced due to the use 
of polycide bit lines, 1024 sense amplifiers 
are positioned in an array in the Y direction 
(direction of shorter side). Therefore, the length 
of the shorter side of the chip is 4.84 mm; this 
is less than the maximum length for a plastic 
300 mil DIP. A cell array of 1024 columns x 
512 words (512 Kbits) is considered a unit. 
Eight blocks of this array are laid in the X 
direction (direction of longer side) to configure 
a 4-Mbit array. When compared with the 1-Mbit 
DRAM, the chip area of the 4-Mbit DRAM is 
increased by only 28 percent (comparison 
between Fujitsu products). 
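
The array organization described above can be verified with a few lines of arithmetic. The sketch below also derives an upper bound on the sense amplifier pitch, under the assumption (not stated in the text) that the 1024 sense amplifiers span no more than the full 4.84 mm shorter side; the real pitch must be smaller because other circuits share that edge.

```python
COLUMNS = 1024        # sense amplifiers along the Y direction
WORDS = 512           # word lines per unit array
UNITS = 8             # unit arrays laid out along the X direction
SHORT_SIDE_MM = 4.84  # length of the shorter chip side

unit_bits = COLUMNS * WORDS
total_bits = unit_bits * UNITS
max_pitch_um = SHORT_SIDE_MM * 1000.0 / COLUMNS

print(f"unit array: {unit_bits // 1024} Kbits")
print(f"whole array: {total_bits // 2**20} Mbits")
print(f"sense amplifier pitch: below {max_pitch_um:.2f} um")
```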

Although the bit line reset voltage is set to 
1/2 Vcc to reduce the power consumption, the 
charging or discharging current of bit lines 
reaches 70 mA (tRC = 180 ns) when all arrays of 
the 4-Mbit DRAM operate at the same time. 
In the 1-Mbit DRAM, the charge and discharge 
current was reduced to three-fourths of the 
conventional value through the divisional 
driving of arrays. In the 4-Mbit DRAM, only 
one-fourth of the arrays are driven and the 
current consumption by bit lines is reduced to 
about 18 mA. Once the current consumption by 
arrays is reduced, the power consumption of 
peripheral circuits becomes the next problem. 

Fig. 21—Timing diagram of 16-bit multi bit test (Dw, Dr, DE are write, read, expected data). 
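
The effect of divisional driving follows directly from the figures quoted above. The sketch below is a simple proportional estimate, assuming the bit line current scales linearly with the fraction of arrays driven; the function names are hypothetical.

```python
def driven_bitline_current_ma(all_arrays_ma, fraction_driven):
    """Average bit line current when only part of the arrays is activated."""
    return all_arrays_ma * fraction_driven


def charge_per_cycle_nc(current_ma, cycle_ns):
    """Charge moved per memory cycle (mA x ns = pC, converted to nC)."""
    return current_ma * cycle_ns * 1e-3


quarter = driven_bitline_current_ma(70.0, 0.25)
print(f"one-fourth of the arrays: {quarter:.1f} mA")
print(f"charge per 180 ns cycle, all arrays: {charge_per_cycle_nc(70.0, 180.0):.1f} nC")
```

The proportional estimate gives 17.5 mA, which matches the value of about 18 mA given in the text.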

The increase in power consumption due to 
the increase of DRAM capacitance, and the 
deterioration in access time due to the wiring 
delay time have been suppressed by improving 
the DRAM performance through the scaling 
of the transistor size. However, for a large 
capacity of 4 Mbits, the improvement in memo- 
ry device performance made only by scaling 
the transistor is approaching its limit. This 
is because the wiring delay time becomes the 
dominant performance factor as described 
before. 

To solve this problem, the circuits, 
including peripheral circuits, are divided into blocks, 
which greatly improves the power-delay product of the 
peripheral circuits (see Fig. 20). A 
1-Mbit array containing a cell array and a 
clock generator circuit to drive the array is 
considered a unit block. The 4-Mbit memory 
consists of four such blocks. During 
normal reading or writing, only the selected 
1-Mbit block is operated. Consequently, the 
chip can maintain high speed and low power 
consumption because it operates under an 
internal load as small as that of a 1-Mbit DRAM. 

4.2.3 Test mode 

Since the development of the 1-Mbit DRAM, 
the increase in test time with 
memory capacity has become an issue. This is 
a serious problem for the 4-Mbit DRAM. 
For example, when a DRAM with a 4-Mbit x 1-bit 
organization is tested with a cycle time of 
300 ns, a test time of about 15 s is required 
even when a simple marching pattern is used. 
For the 1-Mbit DRAM, the parallel test mode is 
activated by applying a voltage higher than 
Vcc to the Test Enable (TE) pin, which used to 
be an NC pin. However, for the 4-Mbit DRAM, 
reduction of the test time is strongly desired at 
the board level and there is no unused pin. 
Therefore, the 4-Mbit DRAM is designed to have 
an 8-bit parallel test mode controlled by 
TTL logic inputs. 
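
The 15 s figure and the benefit of the parallel test mode can be reproduced with a short calculation. The sketch below is an estimate only: it assumes the test time is simply the number of cell accesses multiplied by the cycle time, and that the 8-bit parallel mode divides the number of accesses by eight. The number of passes is inferred from the quoted test time, not taken from the paper.

```python
BITS_4M = 4 * 2**20


def pass_time_s(bits, cycle_ns):
    """Time to access every bit once at the given cycle time."""
    return bits * cycle_ns * 1e-9


one_pass = pass_time_s(BITS_4M, 300.0)
passes = 15.0 / one_pass        # accesses per cell implied by a 15 s test
parallel_s = 15.0 / 8           # same pattern with 8 bits tested at once

print(f"one pass over the array: {one_pass:.2f} s")
print(f"implied accesses per cell: about {passes:.0f}")
print(f"with 8-bit parallel test: about {parallel_s:.1f} s")
```

A single pass takes about 1.26 s, so a 15 s test corresponds to roughly twelve accesses per cell; testing eight bits at a time brings the same pattern down to about 2 s.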

Table 2 lists the test functions employed for 
the 4-Mbit DRAM. Figure 21 shows their 
timing charts. The parallel test mode entry is 
done by the WE, CAS before RAS (WCBR) 
cycle. The exit cycle is done by the RAS 
only refresh or CAS before RAS cycle. During 
test mode operation, refresh can be executed 
in either the simple read cycle or WCBR entry 
cycle. 

Table 2. Function of multi-bit test 

Organization   Test entry       Test exit           Result               No. of MBT 
4M x 1         WE, CAS before   CAS before RAS or   Pass = 1, Fail = 0   8 
               RAS (WCBR)       RAS only refresh    from Dout 
               Ditto above                          Pass = 1, Fail = 0   8 
                                                    from DQs 

Fig. 22—Output waveform operating in a fast page 
mode. 

For the test result output, the Dout pin 
outputs "1" for "pass" when all data of the 
eight parallel read bits match, and "0" for 
"fail" when at least one bit of data does not 
match. The 3-state output method, which uses a 
high impedance state for test result output, is 
not used; thus the test can be executed easily 
on the board. 
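
The two-state result output described above amounts to compressing eight read bits into a single pass/fail bit. The sketch below is a behavioral illustration, not the on-chip circuit: it assumes that "match" means agreement with the expected data value (DE in Fig. 21), and the function name is hypothetical.

```python
def multi_bit_test_result(read_bits, expected):
    """Return 1 (pass) if all eight parallel read bits equal the expected bit."""
    if len(read_bits) != 8:
        raise ValueError("the parallel test mode reads eight bits at once")
    return 1 if all(bit == expected for bit in read_bits) else 0


print(multi_bit_test_result([1] * 8, expected=1))
print(multi_bit_test_result([1, 1, 1, 0, 1, 1, 1, 1], expected=1))
```

Because the output is always a driven 0 or 1 rather than a high impedance state, a board-level tester only has to sample one ordinary logic level per cycle.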


4.3 Characteristics of 4-Mbit DRAM 

The 4-Mbit DRAM designed as described 
above operates at high speed with low power 
consumption. Figure 22 shows the output 
waveform of the DRAM in the fast page mode. 
The measurement conditions are: supply voltage 
Vcc is 5 V, ambient temperature is 25 °C, and 
RAS-CAS delay time tRCD is tRCD max. As 
shown in the figure, the RAS access time is 
typically 56 ns. This is faster than that of 
256-Kbit and 1-Mbit DRAMs. 

Fig. 23—Average Vcc current (Icc1) vs Vcc voltage (tRC = 180 ns). 

Fig. 24—Average Vcc current (Icc1) vs operational 
frequency. 

Figure 23 shows the dependence of the average operating 
current (Icc1) on the supply voltage 
(Vcc). Figure 24 shows the cycle time dependence 
of Icc1. For reference, data of the 4-Mbit 
DRAM is compared with data of a 256-Kbit 
DRAM (MB81256) using nMOS technology. 
Under typical conditions, the operating current 
of the 4-Mbit DRAM is 34 mA, while that of 
the 256-Kbit DRAM is 48 mA. The comparison 
reveals a large reduction in the operating current. 
This results from the array division. The 
memory cell array is divided into eight sections 
and configured as four 
independent 1-Mbit blocks, each including the 
peripheral circuits to drive its arrays. 

Fig. 25—Accelerated alpha particle induced soft error 
result. 
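
The size of the reduction is clearer when the two currents are normalized by capacity. The sketch below is a rough comparison, assuming both currents are measured at Vcc = 5 V under comparable typical conditions; the function name is hypothetical.

```python
def current_per_mbit_ma(total_ma, bits):
    """Operating current normalized to one megabit of capacity."""
    return total_ma / (bits / 2**20)


dram_4m = current_per_mbit_ma(34.0, 4 * 2**20)     # 4-Mbit CMOS DRAM
dram_256k = current_per_mbit_ma(48.0, 256 * 2**10) # 256-Kbit nMOS DRAM

print(f"4-Mbit: {dram_4m:.1f} mA/Mbit, 256-Kbit: {dram_256k:.0f} mA/Mbit")
print(f"improvement per bit: about {dram_256k / dram_4m:.0f} times")
print(f"chip power at 5 V: {34.0 * 5:.0f} mW versus {48.0 * 5:.0f} mW")
```

Per bit of storage, the operating current is lower by a factor of more than twenty, even though the chip holds sixteen times as many bits.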

The 4-Mbit DRAM has the same power 
consumption as a 1-Mbit DRAM because it 
executes operations equivalent to that of a 
1-Mbit DRAM using its internal circuits, includ- 
ing peripheral circuits. This is a large advantage 
for PC board assembly. The memory board 
capacity can be increased four times by using 
the 4-Mbit DRAM without changing the power 
supply or cooling system. 

The peak current of the 4-Mbit DRAM is 
rather low (100 mA) compared to a 256-Kbit 
DRAM. 

Reduction of the alpha-particle-induced soft 
error rate is a major issue if the reliability of 
mega-bit DRAMs is to be increased. Figure 25 
shows an example of the test results for soft 
errors using an accelerated test. In this test, the 
chip surface was irradiated with alpha particles and 
the soft error rate was measured while altering 
the DRAM operation cycle time. As shown in 
the figure, the major cause of soft errors is the 
bit line mode. Few soft errors are observed in 
the cell mode, partly because only a small charge 
is collected by the stacked capacitor cell. It 
has already been confirmed that the soft error 
rate of the 4-Mbit DRAM is lower than that of 
the 1-Mbit DRAM because of the scaling of the 
p-n junction area. 

Regarding the packages for the MB814100/814400 
series, the 300 x 675 mil² SOJ can be 
used as described previously. In addition, a new 
300 mil DIP and a 400 mil ZIP are under development. 
The JEDEC standard for the 4-Mbit 
DRAM package has not yet been established 
(except for the 350 x 675 mil² SOJ) because of the 
large restrictions imposed by the various chip 
sizes of different manufacturers. If a JEDEC 
standard is established, Fujitsu will develop 
the corresponding package. In addition, Fujitsu 
plans to study the possibility of using a 300 mil 
SOJ compatible with that of the 1-Mbit DRAM 
in order to realize a single in-line module (SIM) 
mounted with 4-Mbit DRAMs. 


5. Future objectives 

Currently, 1-Mbit DRAMs are mass-produced, 
4-Mbit DRAMs are being accepted in the mar- 
ket, and the concept of 16-Mbit DRAMs is 
being considered. In this age of mega-bit capaci- 
ties, the device and process technologies are 
changing rapidly. 

As the process technology improves, a new 
concept of memory cell technology is required. 
Fujitsu plans to promote the further miniaturi- 
zation of stacked capacitor cells to be used for 
16-Mbit DRAM memory cells. 

As a part of this advance in technology, 
Fujitsu presented a DIET capacitor cell at 
IEDM in 1986. DIET combines the advantages 
of both the stacked cell and the trench cell. It can 
theoretically achieve a large cell capacitance and 
could be realized by burying a three-dimensional 
capacitor cell in an insulated trench capsule. 
In addition, a system to supply the cell-plate voltage 
from the capsule layer in the substrate has been 
developed. Supplying the cell-plate voltage from 
inside the substrate is a new approach. 

As shown by the above discussion, the 
process technology to realize a new memory 
cell must be found by trying various process 
technologies and looking at all possibilities. 
Such methods of development do not determine 
one choice only, but also expand the overall 
potential. Fujitsu will continue to develop the 
technologies for those device designs that 
respond to the diverse needs of the market. 

In device technology, Fujitsu plans to 
develop products having added value and more 
functions in the field of ASICs (including 
video products) that are based on the general- 
purpose products described in this report. 
Fujitsu will continue to introduce high-quality, 
high-performance devices for the market. 


6. Conclusion 

Quadruple integration every three years has 
been maintained in the megabit era. The 
3D STC was the key technology for this steady 
progress of DRAM development. To develop 
this type of DRAM cell, overall process design 
was needed, covering fine lithography, an ultra-thin 
capacitor film, cell capacitance, and a 
soft-error-immune cell structure. 

Combining the high performance CMOS 
DRAM circuits with the STC cell technology, 
we developed the industry’s smallest 4-Mbit 
DRAM having an access time of 56 ns. 
The Sea of Gates was introduced as an LSI that can integrate system-level circuits, including 
memory functions. With this new type of LSI, highly integrated and high-performance 
CMOS LSIs of 30K to 160K gates have been successfully developed. The LSIs are fabricated 
with 1.0 µm or 1.2 µm CMOS triple-metal-layer process technology. An original basic cell 
structure makes it easy to construct both memories and random logic circuits on the LSI 
chip. Furthermore, the unique structure of the I/O buffer cell and improvements in 
assembly technology made high pin counts possible. Cavity-down packages of up to 401 pins were 
developed. 


1. Introduction 

Because of advances in the LSI design 
environment, customer design of LSI has 
become comparatively easy and there are an 
increasing number of engineering work stations 
(EWS) in use. At the same time as these im- 
provements, customer-specific LSI devices called 
Application Specific Integrated Circuits (ASICs) 
have been developed. These devices significantly 
reduce the cost of the products and are unique 
to the customer. Due to their high integration 
and short delivery time, the CMOS gate array 
in particular now has a strong position in the 
ASIC field as the products of related electronic 
equipment diversify and their life cycle is 
shortened. 

Larger scale and higher-performance CMOS 
gate arrays are now possible because of advances 
in LSI design technology and semiconductor 
manufacturing technology. To build 
compact, high-density, large-scale systems using 
LSIs, logic circuits and memory functions such as 
RAM and ROM must be integrated on the 
LSI chip. Taking this into consideration, we 
studied the production of a CMOS gate array 
with a RAM built into a dedicated area. However, 
this approach could not adequately 
meet the various system requirements because 
of severe restrictions on the number of memory 
devices that could be integrated and on the bit/word 
configuration. The Sea of Gates is a new CMOS 
gate array chip architecture designed to overcome 
these problems. In the Sea of Gates, 
we used the triple-metal-layer process technology. 
This technology has been used successfully 
in the mass production of 20K-gate CMOS 
gate arrays and in the production of basic cells 
with structures suitable for memory cell 
configuration. First, five types of CMOS 
gate arrays with 30K to 100K gates were 
produced as the AU series. The 160K-gate Sea 
of Gates was then developed using 1.0 µm 
process technology. 

The Sea of Gates can be used in such areas 
as industrial computers, high performance 
metrological equipment, and in digital audio 
equipment such as VTR and DAT. 

This paper describes the development 
and features of Fujitsu’s Sea of Gates. It also 
discusses future problems. 


2. Description and development of the Sea of 
Gates 
2.1 Sea of Gates 
As shown in Fig. 1, Fujitsu CMOS gate 
arrays incorporate the latest in process technology. 
They are highly integrated, high-speed 
devices. A fixed channel CMOS gate array is 
used for devices containing up to 20K gates. 
This LSI device consists of many basic cells, 
each equivalent to a 2-input NAND gate. Circuits are configured 
by arranging the cells into columns and then 
wiring them on a standard silicon chip that has a 
wiring channel area between the columns. 

Fig. 1—Trend of speed and complexity in Fujitsu CMOS 
gate arrays. 

The Sea of Gates is also called a channelless 
type gate array. The basic cells are 
distributed over the entire internal chip area 
without any wiring area between the cell 
columns. The customers' circuits are obtained 
simply by applying the wiring process to the 
standard silicon chip. This method 
is exactly the same as that used for the fixed 
channel type CMOS gate array. Figure 2 shows 
the configuration of the fixed channel type gate 
arrays and the Sea of Gates. 

The wiring channel region of the fixed 
channel type gate array has a redundancy that 
allows a computer to automatically connect 
complicated wiring. Because of this, the integration 
cannot be increased. However, the wiring 
channel width and position in the Sea of Gates 
can be set freely by using the area over basic 
cells as wiring regions when required. For this 
reason, integration is higher than in fixed channel 
type gate arrays. Since the Sea of Gates can 
spread a hard macro, such as a multiplier circuit, 
RAM, or ROM, over two or more 
basic cell columns, a high-performance hard 
macro can be obtained with a high area efficiency 
and an arbitrary size. 

Fig. 2—Comparison of chip configuration (panel b: Sea of Gates). 

Fig. 3—Total wire length vs. complexity. 

As shown in Fig. 1, we were the first in 
the world to use the triple-layer-metal process 
technology in a 20K-gate CMOS gate array 
and apply it to the Sea of Gates. Because the 
wiring length exponentially increases with the 
gate scale, the integration does not increase 
for the conventional double-layer-metal process 
technology. Large-scale cells such as RAM and 
ROM can often be packaged into the Sea of 
Gates. In double-layer-metal process technology 
the wiring is blocked by these cells. This con- 
siderably reduces the wiring efficiency. In 
triple-layer-metal process technology, if these 
cells consist of the first and second metal layers, 
the reduction in wiring efficiency can be 
prevented because the third metal layer can pass 
over the cells. 

Figure 3 shows the total wiring length of 
an LSI device versus the integration of Fujitsu’s 
CMOS gate array. The total wiring length of 
the LSI device increases exponentially with 
the integration of the LSI device. In areas in 
which the integration exceeds 10000 gates, 
the wiring efficiency can no longer be increased 
by the double-metal-layer wirings only. 

As can be seen from the figure, which compares 
double- and triple-metal-layer wiring, 
the wiring efficiency is improved by triple-metal-layer 
wiring. The triple-metal-layer technology 
is especially effective for the Sea of 
Gates. 

Fig. 4—CAP LSI. 


2.2 Details of development 

First, we manufactured a prototype Sea of 
Gates of 29234 basic cells on a 13 x 13 mm 
chip with a gate length of 1.8 µm using the 
double-metal-layer process. Each basic cell 
had a total of eight transistors: four n-channel 
transistors and four p-channel transistors. 

This Sea of Gates was used to manufacture 
an approximately 24K-gate cellular array 
processor (CAP) LSI device. Figure 4 shows 
a photograph of the chip (CAP LSI). Although 
we used the double-metal-layer process, we 
achieved a high basic cell utilization of about 
71 percent. This was achieved by using 
cells that could be used repeatedly as hard 
macros, such as a first-in-first-out (FIFO) memory. 

Next, when marketing these devices, we 
improved the basic cells. The basic cells were 
structured so that the hard macros of the conventional 
Fujitsu CMOS gate array could be 
used when logic cells (hereafter referred to 
as unit cells) such as 2-input NAND gates 
were formed. The cells were also structured 
so that SRAM and ROM memory cells 
could be configured efficiently. By using this 
basic cell structure, we achieved a high degree of 
compatibility and sharing of libraries with our 
fixed channel type CMOS gate array. 

Fig. 5—Structure of basic cell. 

We developed a CAD tool (cell compiler) 
that automatically generates the SRAM or ROM 
required by the customer as the hard macro. 
We also developed a layout program for the 
triple-metal-layer technology. 

We then put the AU series on the market 
as our first Sea of Gates. 


3. Features of our Sea of Gates 
3.1 Basic cell and metal wiring 

Figure 5 shows the basic cell layout pattern 
and its equivalent circuit. The basic cell consists 
of two types of elements, types A and B. 
Type A is composed of two pairs of n-channel 
and p-channel transistors with a common gate 
electrode. Type A has the same shape as our 
fixed channel type CMOS gate array basic cell. 
Type B is composed of four n-channel 
transistors with small channel widths. The gate 
electrodes of adjacent type B elements are common. 
Two adjacent type B elements are point-symmetrical 
about the center of the layout 
pattern so that they can be laid out even if 
the cell containing them is rotated by 180°. 
Since the conventional unit cell is 
constructed using type A only, the type B element adjacent 
to it is used as the wiring channel region. 
Figure 6 shows examples of the unit cell layout 
and wiring channels. If the wiring channels 
are insufficient in a placement of unit cells such 
as that shown in Fig. 6a), the type A section is 
also used as a wiring channel as shown in 
Fig. 6b). The triple-metal-layer process increases 
the vertical wiring channels by about 50 percent. 

Fig. 6—Placement of unit cells and channel allocation. 

Fig. 7—Memory cells of dual-port SRAM. 


USER > FUJITSU 
SYSTEM >MB63x999; 
REVISION :0001 : 
DATE OB / 117 1-8: 
DESIGNER : FUJITSU 
NAME > RAMO0 
FUNCTION: PCRAM 


SEGMENT 

BIT : 12 
WORD > 1024 
COLUMN : 1 
ENDSEGMENT; 
ENDNAME 
ENDUSER 


Fig. 8— Description example of input data for the cell 
compiler. 


3.2 Memory cell 

The type B section of the basic cell was 
added to efficiently configure the SRAM 
memory cell; type B is used as the transfer 
gate of an SRAM memory cell. In the memory 
cell, type A is used to form a latch. One basic 
cell can configure one bit of a single-port SRAM 
memory cell. Figure 7 shows the layout pattern 
and an equivalent circuit of the dual-port SRAM 
memory cell. Also in the dual-port SRAM, 
one basic cell can create one bit of the SRAM 
memory cell. With the memory 
cell configured as described above, a 16K-bit 
single-port SRAM using the Sea of Gates occupies 
about 2.5 times the area required by our standard 
cell method in the same process technology. 
If a circuit equivalent to the single-port 
SRAM is constructed by combining circuits 
in a fixed channel type CMOS gate array, 
the area will exceed three times that of the 
Sea of Gates. 

The ROM memory cell can create four 
bits with a single basic cell. In this case, only 
the type B n-channel transistors are used. 

SRAM and ROM are automatically 
generated by the cell compiler described below. 


3.3 Cell compiler 

When packaging SRAM or ROM in an LSI 
device, the bit/word configuration normally 
differs according to the customer's requirements. 
Therefore, the SRAM or ROM required by 
the customer must be created at the outset. 
For this purpose, we developed a cell compiler 
for SRAM and ROM. An example of the input 
for SRAM is shown in Fig. 8. When the cell 
specifications are written and entered in a 
simple language, the cell compiler automatically 
generates the layout pattern and the libraries, 
such as the cell simulation model used in logic 
simulation. Figure 9 
shows a layout example of a single-port SRAM 
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Fig. 9—Layout example of single-port SRAM 
(8-bit x 128-word). 


Basic cell array 


Fig. 10—Placement example of I/O buffer cells. 


generated by the cell compiler. 

We have been working towards automatic 
generation of up to four types of layout patterns 
for one logical specification of these cells. 
Such generation increases the layout efficiency 
of the LSI chip when SRAM or ROM is used. 
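
As a rough illustration of the figures in section 3.2 (one basic cell per SRAM bit, four ROM bits per basic cell), the sketch below estimates the minimum number of basic cells taken by the memory cell array alone for a given bit/word configuration. It is not the cell compiler's algorithm. The function and constant names are hypothetical, and peripheral circuits such as decoders are assumed to need further basic cells that are not counted here.

```python
# Bits stored per basic cell, as stated in section 3.2.
BITS_PER_BASIC_CELL = {"sram": 1, "rom": 4}


def min_array_cells(bits_per_word: int, words: int, kind: str) -> int:
    """Lower bound on basic cells used by the memory cell array only.

    Hypothetical helper: decoders and other peripheral circuits are
    deliberately left out of the count.
    """
    if bits_per_word <= 0 or words <= 0:
        raise ValueError("bits_per_word and words must be positive")
    total_bits = bits_per_word * words
    per_cell = BITS_PER_BASIC_CELL[kind]
    # Ceiling division: a partly used basic cell still occupies a full cell.
    return (total_bits + per_cell - 1) // per_cell


if __name__ == "__main__":
    # 12-bit x 1024-word SRAM, the configuration described in Fig. 8.
    print(min_array_cells(12, 1024, "sram"))
    # 32-bit x 1024-word ROM, the ROM size listed in Table 5.
    print(min_array_cells(32, 1024, "rom"))
```

For the 32-bit x 1024-word ROM this lower bound is 8 192 basic cells, whereas Table 5 lists 12 672 basic cells for the generated ROM, which suggests that a substantial share of the cells goes to circuits outside the array.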


3.4 I/O buffer cell 


The I/O buffer cell of the internal/external 
circuit interface of the LSI device is made by 
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Fig. 11—Circuit example of I/O buffer cell. 


dividing a cell into a dedicated cell (external 
I/O) laid out on the chip peripheral region and 
logic cell (internal I/O) laid out on the basic 
cell region; these two cells are then connected 
to each other. The internal I/O consists of type 
A basic cells. Figure 10 shows a layout example 
of an I/O buffer cell of an actual chip. The 
internal I/O is best laid out close to the external 
I/O to be connected to avoid reduction in the 
I/O buffer cell performance. 

Figure 11 shows a bi-directional buffer as 
an example of an I/O buffer cell. 

In the conventional CMOS gate array lay- 
out, the I/O buffer cell region in the chip 
periphery uses all the circuit elements to form 
the bi-directional buffer. However, since the 
chip peripheral region in this system only uses 
the circuit elements that form the external I/O 
in Fig. 11, the area of the peripheral region of 
the chip occupied by a single I/O buffer cell 
can be reduced. As a result, the number of I/O 
buffer cells (number of I/O pads) of the chip 
and the number of signal pins were increased. 
The 100K-gate AU series has 400 I/O pads 
including the power supply pads on a 14.5 x 
14.5 mm chip. 

The internal I/O is a unit cell and can 
easily be configured using basic cells. I/O buffer 
cells with various functions can easily be created 
by changing the internal and external I/O 
combinations. 

With the increasing speed of devices, mal- 
functions caused by simultaneous switching 
noise and ringing noise of output pins are 
becoming apparent. We have solved these 
problems by providing various noise reduction 
circuit buffers. 


3.5 Package 

To cope with an increase in the number 
of I/O pads in the Sea of Gates, we have 
developed pin grid array (PGA) packages. 

These packages are available with 299, 321, 
361 and 401 pins in addition to the standard 
packages of up to 256 pins. 

The 299-, 321-, 361-, and 401-pin packages 
use a cavity down method to improve the heat 
dissipation and can be fitted with cooling 
fins. 

To guarantee the quality of the LSI device, 
the LSI chip junction temperature during 
operation must be kept below a certain value. 
We recommend a maximum junction tempera- 
ture of 150 °C for ceramic packages. 

The chip junction temperature increases 
with the power consumption of the LSI device. 
In the Sea of Gates, the power consumption 
may exceed 5 W depending on the number of 
gates and the operating frequency. Figure 12 
shows the relationship between integration 
and the calculated value of power consumption 
during operation of the AU series of the Sea 
of Gates. The calculation assumes that about 
half of the gates are occupied by SRAM 
and that the number of I/O signal pins of the 
LSI device is the maximum in the series. 

At a power consumption of 5 W, a still-air 
thermal resistance (ceramic package) of 22 °C/W, 
and an ambient temperature of 70 °C, the 
junction temperature reaches 180 °C. If fins are 
fitted and the LSI device is forcibly cooled 
with air at 3 m/s, the junction temperature 
can be kept below 100 °C. 
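
The junction temperature quoted above follows from the usual steady-state relation: junction temperature = ambient temperature + power x thermal resistance. The sketch below reproduces the 180 °C figure and, as an inference not stated in the text, derives the thermal resistance that fins and forced air would have to reach to stay below 100 °C. The function names are hypothetical.

```python
def junction_temperature(power_w: float, theta_ja: float, ambient_c: float) -> float:
    """Steady-state junction temperature in deg C.

    theta_ja is the junction-to-ambient thermal resistance in deg C/W.
    """
    return ambient_c + power_w * theta_ja


def max_thermal_resistance(power_w: float, ambient_c: float, limit_c: float) -> float:
    """Largest thermal resistance (deg C/W) that keeps the junction at limit_c."""
    return (limit_c - ambient_c) / power_w


if __name__ == "__main__":
    # Values from the text: 5 W, 22 deg C/W in still air, 70 deg C ambient.
    print(junction_temperature(5.0, 22.0, 70.0))
    # Thermal resistance needed for a 100 deg C junction at the same power.
    print(max_thermal_resistance(5.0, 70.0, 100.0))
```

Under these numbers, the cooled package would need an effective thermal resistance of no more than 6 °C/W.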

Figure 13 shows a photograph of the 401- 
pin PGA package. Wire bonding is used. 
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Fig. 12—Power dissipation vs. complexity in AU series. 


Fig. 13—A 401-pin PGA package. 


3.6 AU series 

Table 1 lists our AU series CMOS Sea of 
Gates. 

Since a single basic cell can create one 2- 
input NAND gate (actually only type A is used), 
one basic cell is equivalent to one gate. 

Table 2 lists the main features of the AU 
series. Table 3 lists the AU series package 
options. 


3.7 160K-gate Sea of Gates 

3.7.1 Features of LSI device 

To expand our Sea of Gates series, we 
developed a Sea of Gates with about 160000 
basic cells on the chip”. This LSI device has 
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Table 1. AU series CMOS sea of gates 


Device name   Part number   Number of     Usable        Number 
                            basic cells   basic cells   of I/O 

C-100KAU      MB630XXX      102 144       76 608        400 
C-75KAU       MB631XXX       75 140       56 355        344 
C-50KAU       MB632XXX       52 164       39 123        288 
C-40KAU       MB633XXX       41 184       30 888        256 
C-30KAU       MB634XXX       31 500       23 625        224 


Table 2. Main features of AU series 


Process                                  1.2 µm triple-metal-layer CMOS 
2-input NAND gate delay (F/O = 2)        0.8 ns 
Power 2-input NAND gate delay (F/O = 2)  0.58 ns 
18K-bit single-port SRAM access time     57 ns 

Configurable SRAM, ROM 
  Single, dual & triple-port SRAM        32 bits-18K bits 
  ROM                                    128 bits-64K bits 

Gate utilization 
  Logic with SRAM or ROM                 75% max 
  Logic only                             50% max 


Table 3. AU series package options 


Device | ee 8 grid slid ieee paces 
135 179|208 256 299/321/361|401/120|160 
C-10KAU]o |o |o |o [o Jo fo fo | i= 
C-7SKAU;|O |Oo |O0 |}0 |}O |O | 0 | 
G50KAU| 0 |o |o |o Jo fefet TT 
C-40KAU|0 }0 | 0 Jo | oun ke) 
C30KAU;o lolo| | [ lo Jo. 


a gate length of 1.0 um and was obtained using 
the most advanced process technology with 
greater microminiaturization than that of the 
AU series. Table 4 lists the basic features of 
the 160K-gate Sea of Gates. The actual measure- 
ment results of the ring oscillator show that 
the propagation delay time of the inverter 
with a fan out of 1 is 230 ps. The maximum 
number of usable basic cells is about 120000 
(75 percent) when SRAM or ROM is used. 
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Table 4. Basic features of 160K-gate sea of gates 


Technology                         p-substrate, twin-tub 
Gate length                        1.0 µm 
First metal pitch                  2.9 µm 
Second metal pitch                 3.8 µm 
Third metal pitch                  5.8 µm 
Chip size                          14.5 mm x 14.5 mm 
Total basic cell counts            160 170 
Total I/O cell counts              400 
Supply voltage                     5 V 
Inverter delay (ring oscillator)   230 ps 




Fig. 14—FDSP4 on sea of gates. 


3.7.2 Test chip 

Our 160K-gate Sea of Gates achieved the 
same circuit function as that of our 32-bit 
floating point digital signal processor (FDSP 4: 
MB86232). Figure 14 shows the FDSP4 realized 
on a Sea of Gates. RAM and ROM were auto- 
matically generated by the cell compiler. The 
decoder section of the original LSI device 
was composed of nine PLAs. However, for this 
test chip, the decoder section is composed of the 
logic gates that have been expanded using our 
logical synthesizer system “ZEPHCAD®”. 

Table 5 lists the circuit organization of the 
test chip. The basic cell utilization is 72.5 per- 
cent. 
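
The utilization figure can be checked directly from the block sizes in Table 5 and the total basic cell count in Table 4. The short sketch below does that arithmetic; the variable and function names are illustrative only.

```python
# Basic cells per block, taken from Table 5.
TEST_CHIP_BLOCKS = {
    "32-bit x 512-word dual-port SRAM x 2": 55384,
    "32-bit x 16-word dual-port SRAM": 5292,
    "32-bit x 1024-word ROM": 12672,
    "Random logic": 42798,
}

# Total basic cells on the 160K-gate Sea of Gates, taken from Table 4.
TOTAL_BASIC_CELLS = 160170


def utilization_percent(blocks: dict, total_cells: int) -> float:
    """Share of the available basic cells used by the listed blocks, in percent."""
    return 100.0 * sum(blocks.values()) / total_cells


if __name__ == "__main__":
    print(sum(TEST_CHIP_BLOCKS.values()))
    print(round(utilization_percent(TEST_CHIP_BLOCKS, TOTAL_BASIC_CELLS), 1))
```

The block counts sum to 116 146 basic cells, which is 72.5 percent of the 160 170 cells on the chip.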
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Table 5. Circuit organization of the test chip 


32-bit x 512-word dual-port SRAM x 2     55 384 basic cells 
32-bit x 16-word dual-port SRAM           5 292 basic cells 
32-bit x 1 024-word ROM                  12 672 basic cells 
Random logic                             42 798 basic cells 
Total basic cell counts                 116 146 basic cells 
Basic cell utilization                   72.5 percent 


4. Future problems 

Higher integration and higher speed of LSI 
devices have raised various problems in LSI 
design and practical use, and the Sea of Gates 
is no exception. This section describes these 
problems and methods of solving them. 

An increase in LSI circuit scale significantly 
decreases the development efficiency, because 
the work required for logic design, logic 
simulation, and test data creation, and the time 
required for computer processing, all increase 
greatly. To make the most of the Sea of 
Gates' advantage of quick prototyping, the 
development time, including the logic design 
time, must be reduced. 

To achieve these targets, the following 
should be considered: 

1) Logic design 
i) Use of a tool that automatically 
generates FIFO and ALU. Use of library 
such as the functional blocks. 

ii) Use of logical synthesizer system 

This system writes the LSI logical circuit 
specification with a high-level language 
such as the finite state machine language 
and then automatically converts it to 
the logical circuit that uses the LSI 
cell library. 

2) Logic simulation 

For logic simulation of an LSI device 
exceeding several tens of thousands of gates, 
the simulation tool must process large amounts 
of data and must therefore be made faster. 
A dedicated system called a simulation 
accelerator also helps to solve these problems. 
With this system, the processing time can be 
reduced by a factor of at least ten relative to 
simulation on an engineering workstation (EWS). 
3) Test data creation 

The work required to create the test data 
increases greatly as the circuit scale increases 
and can no longer be handled manually. 

To solve this problem, the following should 
be considered: 

i) Logic design that uses scan-flip-flop and 

automatic generation of test patterns” 

ii) Built-in test circuit® 

iii) Unit test of megacells. 

This method configures the circuit so that 
megacells such as RAM and ROM packaged 
on the LSI device can be tested directly using 
the LSI external pins. 

4) Use of automatic layout tool 

To achieve high performance of a large- 
scale LSI device, the layout of the LSI device 
is best designed by a logic designer. Recently, 
a dedicated automatic layout tool that can 
be used by the customer has been developed. 
We think that this tool will be increasingly used. 

The Sea of Gates has the following 
problems: 

1) Increase in power consumption 
2) Switching noise resulting from the speedup 
of the LSI output buffer. 

The high-speed operation of the output 
buffer causes noise in the PC board on which the 
LSI device is mounted. This noise causes various 
problems. There are two types of noise: simul- 
taneous switching noise and a ringing/reflected 
noise. The first type is suppressed by reinforcing 
the ground inside the LSI device and by restrict- 
ing the number of simultaneous switching pins 
for each ground. The latter is caused by the 
high-speed switching of the LSI device, stray 
capacitance of the PC board, and impedance. 
This noise is suppressed by connecting damping 
resistors and terminating resistors on the PC 
board. 


5. Conclusion 

A variety of large-scale ASICs containing 
more than 100K gates can now be developed 
using the Sea of Gates. These devices 
can be manufactured in a short period and also 
in small quantities, as required. 

Although there are problems such as the LSI 
development efficiency and packaging noise 
for the large scale ASIC, we believe we can 
solve these problems and develop the Sea of 
Gates that meets the customer’s requirements. 
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The Fujitsu Flexible Microcontroller (F7MC) has been developed to meet the market's need 
for a high-performance application-specific controller. This microcontroller features high 
speed (0.33 µs cycle time), efficient object code, and flexible design, and can be applied to 
many areas. 

The CPU architecture, design philosophy, and technology features of the CPU are described, 
and various application products are shown. 


1. Introduction 

Microcomputers have been used in various 
types of electronic equipment since the 4004 
microcomputer was introduced in 1971. De- 
pending on the application, microcomputers can 
be classified as data processors (MPU) such as 
the CPU used in personal computers, and real- 
time processing controllers (MCU) that are built 
into household appliances and office business 
equipment. The MCU integrates the CPU and all 
control and storage circuits including those for 
peripheral hardware devices and memory on a 
single silicon chip. It is also called a single-chip 
microcomputer. 

To meet these needs, we developed an 8-bit 
CPU core with 16-bit internal processing, 
called the Fujitsu Flexible Microcontroller 
(F²MC). Based on this CPU core, we have 
developed various MCU products. Chapters 2 
to 4 outline the CPU core and its features. 
Chapter 5 introduces the various MCU 
products. 


2. Architecture of F²MC CPU 
2.1 Market requirement 
Figure 1¹⁾ shows the market shipment statistics 
and forecast for this classification. The 
figure shows that market demand for 8-bit MCUs 
is very strong because the application fields have 
expanded. Many new technical requirements have 
emerged; these are summarized below. 
1) High-speed performance corresponding to 
the high performance of the system. 
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2) High efficiency of object codes suitable for 
single-chip structure. 

3) MCUs that form the foundation of Applica- 
tion-Specific Integrated Circuit (ASIC) 
microcomputers that enable the system to 
be differentiated. 

4) Debugging capability that facilitates the 
development of complex programs. 


2.2 Development goals 
To determine the F²MC CPU architecture, 
we established the following goals according to 
the market demand described in section 2.1. 
1) Realization of high-speed performance 
High-speed instructions must be achieved. 
High-speed performance when the task 
environment is switched (i.e. high-speed response 
for interrupt processing) must also be achieved. 
2) Improvement of object code efficiency 
A single-chip microcomputer in which the 
capacity of mounted ROM is physically 
restricted requires high object code efficiency. 
This is especially true if it is to be economical. 
3) ASIC design and manufacturing processes 
that can quickly cope with market and 
user requirements must be developed. 
4) Improvements in debugging 
A system that can debug a realtime processing 
program even in the mounted state must be 
realized. 
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Fig. 1—Shipment of microcontroller units!?. 
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Fig. 2—Resister set of FMC. 


2.3 Implementation of architecture 

This section describes how the goals de- 
scribed in section 2.2 have been achieved. 
1) Register set 

Figure 2 shows the F²MC register set. The 
accumulator of this register set has a unique 
design. In addition to the conventional 
accumulator (A), a temporary accumulator (T) is 
provided. When data is transferred to A, the previous 
contents of A are automatically transferred to T. 
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The example of logical AND of #74 and #05 is 
shown below. 


a) MOV A, #74 
b) MOV A, #05 
c) ANDW A 


The data flow of operations a) to c) is as follows: 


a) #74 is transferred to A, and at the same time 
the previous contents of A are saved in T. 


b) #05 is transferred to A, and at the same time 
the contents of A (#74) are saved in T. 


c) The logical AND of A (#05) and T (#74) is 
stored in A, whereas T holds the same value. 


Fig. 3—Operation of temporary accumulator. 


Instructions are provided so that two-term oper- 
ations having low frequencies of use can be 
executed between A and T. Thus, two-term 
operations can be executed without deterio- 
rating the object code efficiency. 

Another feature is that the registers are 
provided in memory, so the register size, which 
changes with the size of the task, can be 
adjusted in the built-in RAM. At the same time, 
the overhead for task switching is reduced. 

Also, memory register RW0 and A can be used 
as pointer registers in addition to the one 
index register. Figure 3 describes an example of 
a term operation between A and T. 
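
The behavior of the A and T accumulators in Fig. 3 can be modeled in a few lines. This is a minimal sketch, not a description of the actual hardware: the class and method names are hypothetical, the operands #74 and #05 are assumed to be hexadecimal, and both accumulators are assumed to be 16 bits wide.

```python
class AccumulatorPair:
    """Toy model of the accumulator (A) and temporary accumulator (T)."""

    MASK = 0xFFFF  # assumed 16-bit register width

    def __init__(self) -> None:
        self.a = 0
        self.t = 0

    def mov_a(self, value: int) -> None:
        """MOV A, #value: the old contents of A move to T automatically."""
        self.t = self.a
        self.a = value & self.MASK

    def and_a(self) -> None:
        """AND between A and T: the result goes to A, T is unchanged."""
        self.a &= self.t


if __name__ == "__main__":
    cpu = AccumulatorPair()
    cpu.mov_a(0x74)  # step a)
    cpu.mov_a(0x05)  # step b): A = 0x05, T = 0x74
    cpu.and_a()      # step c)
    print(hex(cpu.a), hex(cpu.t))
```

Because the second load pushes the first operand into T implicitly, the AND instruction needs no operand field, which is how two-term operations avoid lengthening the object code.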

2) Instruction set 

Several benchmark tests were performed to 
determine the instruction set. As a result of 
these tests, short codes were assigned to the 
transfer and branch instructions that occur 
frequently in MCU application programs. 
However, even for instructions that do not 
occur frequently (such as multiplication and 
division), there were cases in which the speed 
and object code efficiency of the two-term 
operations severely deteriorated when the 
instruction function was not provided. To avoid 
this, the A-T operation instructions described in 
item 1) were used. Figure 4 shows the 
occurrence frequency in comparison with a general 
8-bit CPU. 

The result is a powerful instruction system 
including the operation of a 16-bit register that 
is successfully contained in an 8-bit operation 
code (see Table 1). 

3) Interrupt 

Because a high-speed interrupt was con- 
sidered important, the save/return path when 
an interrupt occurs uses a direct path with a 


(Chart: percentage of instruction types, namely 
mathematical, transfer, branch, and other types, 
compared with a general 8-bit standard CPU.) 


Fig. 4—Constitution of instruction set. 


32-bit RAM instead of the conventional 8-bit 
data path. For this reason, the stack pointer uses 
a hardware stack pointer that is independent of 
the software stack pointer for subroutine calls. 

With this mechanism, registers A, T, PC, 
and PS can be saved within four bus 
cycles. 

4) Mode 

To expand the applications of the CPU, 
various modes including an external memory 
mode are required. 

A conventional MCU sets the modes by 
controlling the pins. However, pins are an 
important resource in a single-chip 
microcomputer. Therefore, the pins control only 
whether the vector is fetched at the start from 
inside or outside the chip. The mode is then 
assigned by the third byte of the vector fetched 
at the start. 

As a result, many modes can be provided 
including a test mode and one-time ROM mode. 
5) Debugging 

To improve the debugging efficiency during 
program development, we used a system in 
which the debugging signal was sent to and re- 
ceived from the emulator using the pins at the 


Table 1. Operation code map 


L 0 1 2 3 4 5 6 7 8 9 A B (6 D E F 
9 [NOP |PF1 DECW |INCW |CBNE_ |XCHW | XORW | ANDW |MOV MOV |MOV MOV | ADDW SUBW 
IX IX RO,#,rel/A,T JA A @A,T dir, A |@IX+off, A |@RWO IX,A  |IX,A 

, [PFO |TEST |DECW |INCW |CBNE |EXT |ADDW |SUBW |MOV MOV |MOV MOV CMPW_ | CMP 
| A A A R1, #rel A A A, @A A,dir | A, @IX+toff | A, @RWO A A 

> |MOVW |MOVW |RORC |ROLC |CBNE |MOV_ /ADDC |SUBC |ADD ADD |ADD ADD ADDDC | SUBDC 
1 A, IX |A A R2, #,rel|HSP, A |A A @A A, dir |A, @IXtoff | A, @RWO A A 

3 |MOVW |MOVW |SHRW |SHLW |CBNE |MOV |ORW | NOTW |SUB SUB _|SUB SUB i DIVU. |MULU 
SP,A |A,SP |A A R3, #,rel|A, HSP |A A @A A,dir |A, @IX+off |A,@RWO A A 

4 |MOVW |MOVW |ADDW |SUBW |CBNE |MOVW |DECW |INCW |MOVW_ |MOVW |MOVW MOVW SWAP 
RWO, A|A, RWO|A, RWO|A, RWO|R4, #,rel |RWO, #/RWO |RWO |@A,T dir, A |@IX+off, A |—RWO, A 

5 |MOVW |MOVW |ADDW |SUBW |CBNE /MOVW |DECW |INCW |MOVW | MOVW |MOVW MOVW RETI BRA 
RWI, A|A, RWI1|A, RW1|A, RW1|R5,#,rel|RW1, #/RW1 |RW1 |A,@A A,dir |A, @IX+off |A,@RWO rell0 

6 |MOVW |MOVW |ADDW |SUBW |CBNE |MOVW |DECW |INCW |CBNE JMP JMP SWAPN 
RW2, A|A, RW2|A, RW2|A, RW2|R6,#,rel|RW2,#/RW2 |RW2 |A,#,rel ext @A 

| | 

7 |MOVW |MOVW |ADDW |SUBW |CBNE /|MOVW |DECW |INCW |CWBNE CALL XCHN CALLV|MOVN | RET 
RW3, A|A, RW3|A, RW3|A, RW3|R7, #,rel |RW3, #/RW3_—_|RW3__|A, #, rel ext A, @RWO | vet A, #4 

g [MOV |MOV |ADD |SUB| |DBNE |MOV |DEC |INC  |SETB MOV |MOVW MOVW PUSH |BZ/BEQ 
RO,A |A,RO |A,RO |A,RO |RO,rel |RO,# |RO RO bit A,# |A,# ext, A A rel 

9 |MOV |MOV |ADD |SUB_ |DBNE |MOV |DEC |INC  |CLRB MOV |MOVW MOV PUSHW |BNZ/BNE 
R1,A |A,RI J|A,R1I |A,RI |Ri,rel |R1,# [RI RI bit RP,# |IX,# ext, A A rel 

a [MOV |MOV |ADD |SUB_ |DBNE |MOV |DEC |INC — |MOVB ADD |ADDW MOVW PUSHW |BC/BLO 
R2,A |A,R2 |A,R2 |A,R RD re R2.# |R2 R2 bit, A A,# |A,# JA, ext Ix rel 

p [MOV |Mov RR SUB |SUBW MOV PUSHW | BNC/B} 
R3,A i A, # A,ext PS tel 

Ew 0 X pe 
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Fig. 5—Piggy back devices interfacing the emulater. 


upper part of the package in the piggyback 
mode. 

Therefore, a piggyback-chip for evaluation 
can be used for both EPROM-installed packaging 
evaluation and break and trace debugging by 
connecting the emulator (see Fig. 5). 

In addition, the debugging efficiency in 
realtime processing was improved by adding 
functions that enabled the internal time base 
timer to temporarily halt when a break occurred. 
6) Test 

The CPU runs in the test mode according to 
the contents of the third byte of the vector as 
explained in article 4). An important feature of 
the test mode is that the bus connected to the 
CPU inside the device is disconnected from the 
CPU in this mode. Memory devices and pe- 
ripheral resources other than the CPU can be 
directly tested from the device pin via the bus 
without CPU instructions. This system achieves 
compression and portability of the test patterns. 
This system is especially suitable for manufac- 
turing systems such as for an ASIC that com- 
bines peripheral hardware devices according to 
user’s requests. 


3. Hardware design 
3.1 State machine and PLA 

This CPU design is based on a 5-bit state 
counter and a PLA control circuit that includes 
an instruction decoder. The PLA system was 
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Fig. 6—Optimization of PLA (The long routing area is 
reduced by adding one redundant decoding). 


used because it has only a few complicated 
instructions such as multiplication and division. 
The advantages of speed and cost were also 
considered. 

One advantage of the PLA design is that 
redundant decoding is possible according to the 
layout. For example, Fig. 6 shows that when the 
decoder output is located far from the last gate, 
the wiring area can be reduced by decoding at 
another decoder location. 

To achieve high-speed performance, a 1-byte 
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Address bus 


Data bus 


Fig. 7—Block diagram of CPU. 


a) Reduced latch circuit (R: resistive inverter) 


b) Clock latch circuit (symbols distinguish 
p-channel and n-channel transistors) 


Fig. 8—Static latch circuits. 


queue is provided and instructions are prefetched 
in the non-memory cycle. The queue was not 
made any longer because of costs and to prevent 
debugging from getting too complicated. 
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3.2 Bus and register 

Figure 7 shows the configuration of the 
registers and ALU inside the CPU. The accumulator 
(A) and ALU are connected using two 16-bit 
buses (A and B). A and T are actually placed at 
the edge of the RAM. The CPU and RAM are 
connected using a 32-bit bus, so the registers 
can be saved and restored at high speed during 
an interrupt. Although the RAM has a circuit 
configuration that can be accessed from both 
the bus and the CPU, it is not a dual-port RAM 
because it cannot be accessed from both 
simultaneously. 


3.3 Static design 

This CPU is designed to cover high-speed 
applications and also to be usable in portable 
equipment. Thus, the CPU must operate over a 
wide range of speeds (including low speed) and 
at low voltage and low current. A static 
design was used to satisfy these requirements. 
For example, Fig. 8 shows latch circuits that 
maintain the static characteristics 
without increasing the size of the circuit. 


3.4 Test circuit 

As explained in chapter 2, this CPU enters 
the test mode according to the third byte of 
the vector when it is started. In the test mode, 
the bus buffer of the CPU is set to the tri-state 
condition so that the bus can be driven externally. 

The interrupt test is also important because 
an interrupt can be caused by many factors. In 
the conventional test method, a great number of 
test vectors must be recreated each time the 
peripheral resource is changed. This is because 
the CPU uses a multiple interrupt to make its 
levels programmable. To avoid these problems, 
we have provided an interrupt register and have 
separated the interrupt processing in the CPU 
from interrupts caused by peripheral resources. 
This method enables portability of the interrupt 
occurrence test vectors of peripheral equipment. 


4. Technology and performance 

Table 2 lists the CPU hardware specifications 
used in the actual design. Although a 1.5 µm 
CMOS process having two Al layers is currently 
being used (design rules allow a reduction to 
about 1.0 µm), manufacturing using a 1.3 µm 
process is being evaluated. Currently, the 
12 MHz clock (330 ns cycle) is the maximum 
operating frequency. We expect to achieve 
20 MHz by applying the 1.3 µm process. 


Table 2. Hardware specification of F²MC series 


Item                            Contents 

PLA                             18 000 transistors 
Register, ALU, random logic     10 000 transistors 
Interrupt control               1 700 transistors 
Process                         1.5 µm, 2-layer metal 
Area                            11.8 mm² 


5. Application and products 

Using the completed F²MC CPU core, we 
developed various types of general-purpose and 
dedicated microcomputers. Table 3 lists the 
F²MC series products. When developing these 
products, we reduced the layout area by using 
the circuit elements shown in Fig. 8 in the 
peripheral hardware devices to be mounted. 
We also reduced the development turnaround 
time by sharing general-purpose peripheral 
devices such as timers and Universal 
Asynchronous Receiver Transmitters (UARTs). Figure 9 
shows an example of the products. 

Development of various types of products is 
continuing. We plan to work further on a 
method to enable quick development of an 
ASIC microcomputer to meet the user's 
requirements. 

The F²MC CPU core design can be used to 
develop such an ASIC microcomputer. 


Fig. 9—Interrupt test circuit diagram. 
(Blocks labeled in the figure: internal data bus, 
peripheral circuits, interrupt, selector, #15, 
interrupt control, CPU.) 


6. Conclusion 

This paper describes in detail the F²MC 
microcontroller developed by Fujitsu. Its 
Table 3. Products of F²MC series 
Product ROM RAM Timer Serial I/O Seapine! Package | Application 
number | | hardware | 
+ ———a— — + | = T a 
| 8-16 Kbytes | 256-512 bytes | | 
MB897LY (internal) | (internal) 16-bit x 4 | UART | PWM timer | 64 SDIP | General 
64 Kbytes | 64 Kbytes (input and output) 8-bit x 1 | A/D converter 64 QFP | purpose 
(external) (external) | | | | 
| 8-16 Kbytes | 256-512 bytes | 8-bit x 1 | A/D converter | | 
ERY TSA (internal) (internal) | 16-bit x 1 - VED driver BASIE | VED cwplay 
— > - + —<—————— ——————— +— Se + + 
| 16 Kbytes $12 bytes 
(internal) (internal) 8-bit x 2 UART | A/D converter | 
MB8976X' ca kbytes | 64Kbytes | 16-bit x 1 | 16-bit x1] PWMtimer | 89QFP | Servo control 
(external) (external) | 
Remote- and | 
16 Kbytes | 512 bytes 8-bit x | : disk-control | ’ 7 
MBSri2s (internal) | (internal) 16-bit x 1 STILS circuits | SORE | Wildes tisk 
| A/D converter | 
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architecture reflects the market’s requirements 
for microcontrollers. The F²MC uses the most 
advanced design and technology and is also 
optimized for use in ASIC microcontrollers. 

Products have already been developed for 
some application areas, and the CPU core will 
continue to be used in various applications in 
the future. 
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The technological trend toward higher integration and expanded functions of microproces- 
sors has become increasingly important in the design of workstations and embedded con- 


trollers. 


This paper outlines the Gmicro F32 and key technologies for its development. 


1. Introduction 

The functions and performance of information 
processing units such as the latest 
office computers and workstations have recently 
improved greatly. The performance of 
conventional microprocessors is insufficient for 
them to form the nucleus of these devices. 
Thus, 32-bit microprocessor family products 
having high performance and expanded 
functions are required to meet the needs of 
the 1990s. 

Fujitsu has developed the Gmicro F32 
32-bit microprocessor family products to meet 
these needs. 

This paper outlines these products and the 
key technologies required for their 
development. 


2. Background and development objectives 
2.1 Gmicro F32 family 

The Gmicro F32 family products make up 
a total system consisting of a high performance 
32-bit microprocessor unit (MPU), peripheral 
LSIs, support software, and board computers”. 
This MPU is the first one having architecture 
that conforms to the realtime operating system 
nucleus (TRON) proposed by Sakamura!?~?). 

To broadly meet the future requirements 
of the diversified market for 32-bit micro- 
processors, three versions (high performance 
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version, standard version, and low cost version) 
of the MPU have been developed. Emphasis 
has also been placed on the compatibility of 
register sets, instruction sets, addressing modes, 
and data formats. 

The Gmicro F32 family peripheral LSI 
was developed to provide a system with a 
well-balanced configuration that makes use of the 
high performance Gmicro MPU. 

The basic objective of the development 
of support tools was to provide an environment 
that is easy for the users to use by taking ad- 
vantage of the MPU’s features. 


3. Configuration of Gmicro F32 family products 
3.1 MPU 

There are currently three versions of the 
MPU: F32/300, F32/200, and F32/100. These 
MPUs conform to the TRON specification, and 
the register sets, instruction sets, addressing 
modes, and data formats are compatible 
between the versions. Each MPU is outlined 
below. Figure 1 shows the relative 
capabilities of each MPU. 

3.1.1 F32/300 

The F32/300 has the highest performance 
of this family of products because of the 
increased built-in cache memory capacity and the 
enhanced built-in memory management unit 
(MMU) functions. This product also has 
enhanced decimal instructions for the 
minicomputer-class throughput-oriented market and 
can effectively execute COBOL. It contains 
approximately 900 000 transistors and can 
execute 12 MIPS (at 20 MHz operation) in the 
EDN benchmark. 


(Plot of performance in MIPS against number of 
transistors, from the low end market to the high 
end market: 4.5 MIPS at 300K, ASIC core; 7 MIPS 
at 700K, instruction cache of 1 kbyte, built-in 
MMU; 12 MIPS at 900K, decimal instructions, 
built-in MMU.) 


Fig. 1—Positioning of Gmicro MPUs. 

3.1.2 F32/200 

The F32/200 is the intermediate perform- 
ance product. It also has an internal 6-stage 
pipeline structure and incorporates several 
types of cache memory. This product contains 
approximately 700 000 transistors (see Fig. 2) 
and can execute 7 MIPS in the EDN benchmark. 

3.1.3 F32/100 

The F32/100 has good performance at a 
low cost. It is supported only by real storage. 
This product contains approximately 300000 
transistors and can execute approximately 
4.5 MIPS in the EDN benchmark. 

The biggest difference between the F32/100 
and the two family products described earlier is 
that the F32/100 incorporates the peripheral 
functions in the chip with the MPU to form the 
core. Therefore, this product can satisfy the 
market trend towards application specific ICs 
(ASICs). 


3.2 Gmicro F32 peripheral LSI 


Table 1 lists the Gmicro F32 family LSIs 
that are now being developed. Figure 3 shows 
an example of a system configuration comprising 
these LSIs. The peripheral LSI devices 
connected to the system bus are treated as 
system components; this is a new concept. This 
enables a high-performance total system to be 
constructed, taking advantage of the yearly 
improvements in the performance of the MPU 
resulting from, for example, the use of built-in 
cache memory and a pipeline control system. 

Fig. 2—F32/200 MPU (chip size: 14.0 × 14.1 mm²). 

The concept of system components includes 
the MPU, peripheral LSI devices, and various 
support tools. The system components are 
designed so that the LSIs themselves, rather than 
the individual device products, determine the 
system. This differs from the image of a 
component conventionally held by system designers. 

Consequently, the system designer can 
easily design an optimum system having good 
performance by combining these system com- 
ponents. 

The outline of each LSI is described below. 

3.2.1 Floating point Processing Unit (FPU) 

The FPU executes high speed arithmetic 
operations and floating point function operations 
as the MPU coprocessor. The FPU 
has a special high speed communication protocol 
with the MPU. The command sets sent from the 
MPU to the FPU are embedded in the MPU 
instruction set, in which they were reserved from 
the start. This FPU provides main features such as 
the product and sum operations required for 
image display and the area decision instructions 
that are effective for clipping processing. 

Table 1. Gmicro F32 family LSI 

Group           Product  Function                                      Transistors  Technology               Package 
MPU             F32/300  32-bit MPU: high end version, 20 MHz/12 MIPS  900K         CMOS 1.0 μm, Al 3-layer  PGA-179 
MPU             F32/200  32-bit MPU: standard version, 20 MHz/7 MIPS   700K         CMOS 1.0 μm, Al 2-layer  PGA-135 
MPU             F32/100  32-bit MPU: low end version, 20 MHz/4.5 MIPS  ~300K        CMOS 1.0 μm, Al 2-layer  PGA-135, QFP-160 
Peripheral LSI  FPU      32-bit floating point processing unit         ~550K        CMOS 1.0 μm, Al 2-layer  PGA-135 
Peripheral LSI  DMAC     32-bit DMA controller                         380K         CMOS 1.2 μm, Al 2-layer  PGA-1 To 
Peripheral LSI  IRC      32-bit interrupt request controller           9 200        CMOS 1.2 μm, Al 2-layer  SDIP-64, PLCC-68 
Peripheral LSI  CPG      Clock pulse generator                         ~100         Bi-polar                 DIP-12 
Peripheral LSI  TAGM     Tag memory                                    220K         CMOS 1.0 μm, Al 1-layer  PGA-64 
Peripheral LSI  CCM      Cache controller & memory                     ~1200K       CMOS 1.0 μm, Al 2-layer  PGA-124 

Fig. 3—Application system configuration of Gmicro family products. 
(Block diagram; blocks labeled: Gmicro/FPU, Gmicro/DMAC, IRC, 
programmable timer, cache memory control, main memory control, 
bus control, local bus, file controller, communication controller, 
display controller, standard peripheral LSIs, remote bus 
(sub-channel bus), remote memory, remote I/O 1, remote I/O 2.) 

Fig. 4—F32/DMAC (chip size: 11.14 × 10.46 mm²). 

3.2.2 Direct Memory Access Controller (DMAC) 

This controller supports data transfer 
between a peripheral device and memory or 
inter-memory data transfer (see Fig. 4). There 
are four channels, and each channel can operate 
independently. Transfer units of 8, 16, 32, or 
64 bits can be selected, and data can be 
transferred between two separate buses in the 
remote and subchannel modes. The subchannel 
mode, in which an MPU exists on each of the two 
buses, incorporates the interrupt control feature 
used for inter-MPU communication and the 
message buffers. 

Figure 5 shows the internal block diagram. 
The DMAC consists of three units: the transfer 
request control unit, the micro sequence control 
unit, and the data handler unit. The gate control 
PLA section performs the complex control 
of the micro sequence control unit that 
controls the transfer. The way this section is 
built determines the performance: if it is 
composed of random logic, the design load 
increases, and if it is composed of microprograms, 
the number of steps increases and performance 
degrades. 

Fig. 5—Block diagram of F32/DMAC. 

For basic operation, when the transfer 
request control unit accepts a transfer request 
from an external microprocessor, the unit 
sends a transfer start request to the micro 
sequence control unit. The micro sequence 
control unit uses the error test section to check 
the validity of the transfer request parameter. 
If the transfer parameter is valid, the micro 
sequence control unit uses the gate control 
PLA section to activate the transfer sequence 
unit. The DMAC acquires the bus rights to 
execute the DMA transfer. The bus arbitration 
control section of the data control unit executes 
this processing according to the indication of 
the transfer request control unit. The bus 
arbitration control section issues a bus rights 
arbitration request to the bus master, obtains an 
acknowledge signal, and acquires the bus rights. 

After data transfer is started, the counter 
control section updates the transfer source 
address and transfer destination address. In 
this case, the data control unit can use the 
internal 4-byte data holding register (data 
swapper) to freely reassign 4-byte data in units 
of bytes according to the indication of the 
gate control PLA section. 

The gate control PLA section monitors the 
remaining transfer byte counts of the counter 
control section. When the counts are set to zero, 
the PLA section terminates the transfer opera- 
tion. 
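
The transfer sequence described above can be sketched as a small software model. This is only an illustration under stated assumptions, not the DMAC's actual logic: the names `TransferRequest`, `swap_bytes`, and `dma_transfer` are hypothetical, the transfer unit is fixed at 4 bytes, and bus arbitration is omitted.

```python
from dataclasses import dataclass

UNIT = 4  # bytes moved per bus cycle (assumed; matches the 4-byte data swapper)


@dataclass
class TransferRequest:
    src: int    # transfer source address
    dst: int    # transfer destination address
    count: int  # remaining transfer byte count


def swap_bytes(word: bytes, order=(0, 1, 2, 3)) -> bytes:
    """Model of the data swapper: reassign the 4 bytes of a word."""
    return bytes(word[i] for i in order)


def dma_transfer(memory: bytearray, req: TransferRequest, order=(0, 1, 2, 3)) -> int:
    """Run one memory-to-memory transfer and return the bus cycles used."""
    # Error test: check the validity of the transfer request parameters.
    if req.count <= 0 or req.count % UNIT:
        raise ValueError("invalid transfer byte count")
    if min(req.src, req.dst) < 0 or max(req.src, req.dst) + req.count > len(memory):
        raise ValueError("transfer outside memory")
    cycles = 0
    while req.count > 0:  # the transfer ends when the count reaches zero
        word = bytes(memory[req.src:req.src + UNIT])
        memory[req.dst:req.dst + UNIT] = swap_bytes(word, order)
        req.src += UNIT   # counter control: update the source address
        req.dst += UNIT   # and the destination address
        req.count -= UNIT
        cycles += 1
    return cycles
```

Passing `order=(3, 2, 1, 0)` reverses the bytes of every word, which is one use of such a swapper (conversion between byte orders).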

As a result of the above configuration, a high 
speed data transfer of 27 Mbyte/s at a clock 
frequency of 20 MHz is achieved. 
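
As a rough check of what this rate implies, the number of clock cycles per transfer can be computed for each transfer unit. The sketch assumes 1 Mbyte = 10^6 bytes; the text does not state which transfer unit the 27 Mbyte/s figure refers to.

```python
CLOCK_HZ = 20e6           # clock frequency given in the text
RATE_BYTES_PER_S = 27e6   # quoted rate, assuming 1 Mbyte = 10**6 bytes


def clocks_per_transfer(unit_bytes: int) -> float:
    """Clock cycles available per transfer of unit_bytes at the quoted rate."""
    return CLOCK_HZ * unit_bytes / RATE_BYTES_PER_S


for unit in (1, 2, 4, 8):
    print(f"{unit * 8:2d}-bit unit: {clocks_per_transfer(unit):.2f} clocks")
```

If the figure refers to 32-bit transfers, it corresponds to about three clock cycles per transfer.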

3.2.3 Interrupt Request Controller (IRC) 

Figure 6 shows a fabricated IRC chip. The 
IRC expands interrupt requests and generates 
interrupt vectors in a system that uses the MPU. 
A single IRC can expand up to seven interrupt 
requests and can perform multiple interrupt 
processing at high speed using the high speed 
daisy chain connection. For each interrupt 
input, the IRC can set the interrupt level, trigger 
mode, and vector processing, and it supports a 
mode in which an interrupt vector corresponds 
to each function of the MPU. 

Figure 7 shows the internal block diagram of 
the IRC. The IRC consists of seven sections: 
local interrupt input, bus interrupt input, level 
conversion, interrupt output, acknowledge 
control, vector generation, and bus interface. 
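
The selection of one request among several daisy-chained IRCs can be illustrated with a simplified model. Everything below is an assumption made for illustration: the class and function names are hypothetical, a lower level number is taken to be more urgent, and ties are resolved in favor of the IRC nearest the head of the chain. The actual arbitration rules of the IRC are not given in the text.

```python
from dataclasses import dataclass, field

INPUTS_PER_IRC = 7  # from the text: one IRC expands up to seven requests


@dataclass
class IRC:
    levels: list   # interrupt level set for each input (assumed: lower = more urgent)
    vectors: list  # vector generated for each input
    pending: set = field(default_factory=set)

    def request(self, line: int) -> None:
        if not 0 <= line < INPUTS_PER_IRC:
            raise ValueError("an IRC has only seven inputs")
        self.pending.add(line)

    def best(self):
        """Return (level, line) of the most urgent pending input, or None."""
        if not self.pending:
            return None
        return min((self.levels[line], line) for line in self.pending)


def acknowledge(chain):
    """Acknowledge the most urgent request in the chain and return its vector."""
    winner = None
    for position, irc in enumerate(chain):
        candidate = irc.best()
        if candidate is None:
            continue
        key = (candidate[0], position, candidate[1])
        if winner is None or key < winner[0]:
            winner = (key, irc)
    if winner is None:
        return None
    (_, _, line), irc = winner
    irc.pending.discard(line)
    return irc.vectors[line]
```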

3.2.4 Clock Pulse Generator (CPG) 

The CPG provides the MPU and peripheral 
LSI devices with the standard operation clock 
signals. The base oscillating frequency is 
40 MHz. 

3.2.5 Tag Memory (TAGM) 

TAGM can be configured as 2- or 4-way set 
associative through its external 
terminals. It also provides a high speed access of 
27 ns from the address to the bit information 
output. A high performance and compact cache 
memory system can be constructed by combining 
it with another high speed SRAM and a control 
circuit? (see Fig. 8). 

3.2.6 Cache Controller and Memory (CCM) 

The CCM consists of tag data memory, other 
circuits, and control circuits. It has a 16-Kbyte 
physical address capacity. The memory con- 
figuration is 4-way set associative. The main 
storage is updated in the write through mode 
that always writes new data in main storage. 

Fig. 6—F32/IRC (chip size: 4.05 × 3.75 mm²). 

Fig. 8—F32/TAGM (chip size: 4.68 × 8.76 mm²). 

Table 2. Gmicro development support tools 

Cross development support tools 
  Cross support software: librarian, simulator debugger, load module converter 
  Cross machines: VAX/VMS, M series, FM R series 
  System debugger: host emulator (software), in-circuit emulator (hardware) 

Resident development support tools 
  Software: UNIX OS, COOL, Modula-2, assembler, linker, librarian, debugger, ITRON OS 
  Hardware: board computers (CPU, memory, file, I/O, LAN) 

The CCM is used in the BUS watch mode 
that detects when another bus master such as 
the DMAC overwrites memory. 
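
The behavior of the CCM described above (16 Kbytes, 4-way set associative, write through, bus watch) can be sketched as follows. This is a minimal model under stated assumptions: the line size of 16 bytes and the LRU replacement policy are not given in the text and are chosen only for illustration, invalidation on a watched write is likewise assumed, and the class name is hypothetical. The model tracks tags only, not the cached data.

```python
class WriteThroughCache:
    """Tag-only model of a set associative, write through cache."""

    def __init__(self, size=16 * 1024, ways=4, line=16):
        self.line = line                    # line size in bytes (assumed)
        self.ways = ways
        self.sets = size // (ways * line)   # number of sets
        # One list of tags per set, least recently used first (LRU assumed).
        self.tags = [[] for _ in range(self.sets)]

    def _locate(self, addr):
        block = addr // self.line
        return block % self.sets, block // self.sets  # (set index, tag)

    def read(self, addr) -> bool:
        """Return True on a hit; on a miss, allocate the line."""
        index, tag = self._locate(addr)
        entries = self.tags[index]
        hit = tag in entries
        if hit:
            entries.remove(tag)
        elif len(entries) == self.ways:
            entries.pop(0)                  # evict the least recently used line
        entries.append(tag)
        return hit

    def write(self, addr, value, main_storage) -> None:
        """Write through: main storage is always updated."""
        main_storage[addr] = value

    def bus_watch(self, addr) -> None:
        """Another bus master wrote addr: invalidate the matching line."""
        index, tag = self._locate(addr)
        if tag in self.tags[index]:
            self.tags[index].remove(tag)
```

With these assumed parameters the cache has 256 sets. A read after `bus_watch` on the same line misses again, which is how stale data is avoided when the DMAC writes to memory.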


4. Support tools 

Table 2 lists the Gmicro F32 family support 
tools. These tools are as follows: the cross 
development support tools for developing 
programs using the host computer, an emulator 
that debugs the system on the target system, and 
the resident development support tools 
constructed on the board computer that 
incorporates this family of microprocessors. 

These support tools enable effective develop- 
ment of today’s increasingly large-scale software. 
They are designed to take advantage of the 
features of the Gmicro family products. 

The cross development support tools include 
a C compiler, which enables programming and 
debugging to be performed in an integrated 
manner in a high-level language, mainly at the C language 
level®?. 

Figure 9 shows the development procedure 
when the user uses these support tools to 
develop software. 

The features and outlines of the main 
support tools are described below. 


4.1 C compiler 

The language specification conforms to 
the American National Standards Institute Draft 
X3J11 (ANSI Draft) specified as the 
international standard. This compiler also has many 
optimization levels and is designed to generate 
efficient object codes by using the various MPU 
and FPU features. 

Fig. 9—Gmicro F32 program development procedure. 

The specific optimization items are as follows: 

1) CONSTANT FOLDING 

2) COMMON SUBEXPRESSION ELIMINATION 

3) UNUSED DEFINITION ELIMINATION 

4) STRENGTH REDUCTION 

5) BRANCH TAIL MERGING 

6) REGISTER COLORING 

7) PEEP HOLE OPTIMIZATION. 
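
As an illustration of the first item, constant folding replaces an operation on constants with its result at compile time. The sketch below shows the general technique on Python expressions using the standard `ast` module (Python 3.9 or later for `ast.unparse`). It is not the Gmicro C compiler's implementation, and the names `ConstantFolder` and `fold` are hypothetical.

```python
import ast
import operator

# Operators folded by this sketch (a deliberately small subset).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        left, right = node.left, node.right
        if (type(node.op) in _OPS
                and isinstance(left, ast.Constant)
                and isinstance(right, ast.Constant)
                and isinstance(left.value, (int, float))
                and isinstance(right.value, (int, float))):
            value = _OPS[type(node.op)](left.value, right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold(expression: str) -> str:
    """Return the expression with constant subexpressions evaluated."""
    tree = ast.parse(expression, mode="eval")
    tree = ast.fix_missing_locations(ConstantFolder().visit(tree))
    return ast.unparse(tree)
```

For example, `fold("x + 2 * 3 * 4")` yields `x + 24`: the multiplication is performed once by the folder instead of at every execution.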


4.2 Assembler 

The assembler notation is based on the IEEE 
standards and is designed to make program 
description easier. 

Its features include high-level language type 
structured assembly functions such as FOR, 
WHILE, and SWITCH-CASE statements. The 
macro function and conditional assembly 
functions are also included. 

Frequently used MPU instructions have 
multiple formats, including shortened ones. 
To make use of these formats, the assembler 
has a function to automatically select the 
shortest and fastest instruction format for 
the same instruction description. 
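
The selection can be pictured as choosing, among all encodings able to hold an operand, the one with the fewest bytes. The format table below is entirely hypothetical (names, lengths, and ranges are invented for illustration and are not the actual Gmicro instruction formats).

```python
# (format name, encoded length in bytes, smallest and largest immediate value)
# Hypothetical encodings, NOT the actual Gmicro instruction formats.
FORMATS = [
    ("short", 2, -8, 7),
    ("byte-immediate", 3, -128, 127),
    ("general", 6, -2**31, 2**31 - 1),
]


def select_format(immediate: int):
    """Return the shortest format whose range contains the immediate value."""
    fitting = [f for f in FORMATS if f[2] <= immediate <= f[3]]
    if not fitting:
        raise ValueError("immediate does not fit any format")
    return min(fitting, key=lambda f: f[1])
```

The programmer writes the same instruction description in every case, and only the generated encoding differs.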


4.3 Simulator debugger 

The simulator debugger executes MPU and 
FPU instruction operations of the Gmicro F32 
family on a pseudo virtual system of the host 
computer. 

Figure 10 shows the configuration of the 
virtual system section. 

The virtual MPU and FPU simulate the MPU 
and FPU instruction operations. The virtual 
MMU simulates the MMU function and operates 
in the same way as the actual MMU for address 
conversion and memory protection. The com- 
bined virtual MPU and MMU sections corre- 
spond to the actual MPU. 

The virtual memory, interval timer and 
system call functions simulate memory, inter- 
rupt and I/O operations, and the basic hardware 
installed on the target board. 

Fig. 10—Virtual system section of simulator debugger. 

The simulator debugger can sample the 
execution flow, access results, and the internal 
states of the MPU and FPU in detail. It can also 
output various types of information when 
execution of the program is interrupted or 
terminated. 
Typical examples are: 
1) Display of the trace results (execution history) 
2) C0 or C1 coverage. 
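
C0 coverage is commonly understood as statement coverage and C1 as branch coverage. The sketch below shows how both could be derived from an execution trace. It is an assumed illustration, not the simulator debugger's method: the function name and the representation of the trace as a list of statement addresses are hypothetical.

```python
def coverage(trace, statements, branches):
    """Return (C0, C1) for one execution trace.

    trace      -- executed statement addresses, in order
    statements -- set of all statement addresses in the program
    branches   -- set of all (from, to) branch edges in the program
    """
    executed = set(trace)
    taken = set(zip(trace, trace[1:]))            # consecutive pairs in the trace
    c0 = len(executed & statements) / len(statements)  # statement coverage
    c1 = len(taken & branches) / len(branches)          # branch coverage
    return c0, c1
```

For a program with statements 0 to 4 and a conditional at address 1 that goes to 2 or 3, the trace `[0, 1, 2, 4]` gives C0 = 0.8 and C1 = 0.5, since statement 3 and the branch from 1 to 3 were never exercised.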


4.4 Emulator 

The emulator is connected to the target 
system to debug the program and the system. 

The main functions of the emulator include 
substitution for the MPU and memory of the 
target system, and execution control. 

Since this emulator emulates the MPU at its 
highest speed and enables debugging of the 
sections of the target system in which timing is 
critical, both the hardware and software 
can be easily verified at the highest speed of 
emulation. 

It can set various break conditions such as 
instruction break, operand break, and bus break 
for the execution break function using the 
debug support function provided by the MPU. 

The emulator can set break-points for both 
logical and physical addresses in the logical space 
program. It also supports debugging of the 
multiple logical space system. 


5. Key technology 


The key technologies for the development of 
these LSIs (especially the peripheral LSIs) are 
introduced below. 


5.1 Process technology 

This extremely important technology im- 
proves performance by reducing the delay time 
per gate, reduces the gate length of the MOS 
transistor for large scale integration, and effec- 
tively uses the wiring layer. Process technology 
is especially required for an MPU having many 
elements. The F32/300 uses an advanced CMOS 
wafer process with a 1.0 μm gate length 
and three metal wiring layers. 


5.2 Design method 

The Gmicro F32 family LSI design method 
combines the advantages of the full custom 
design method (high density packaging) and the 
standard cell design method (short development 
time). In blocks such as the registers and ALU, in 
which repetition occurs frequently and the 
layout design greatly affects performance, the full 
custom design method is mainly used. The 
standard cell-like short turnaround automatic 
wiring design method is used in random logic 
blocks such as the control circuit. 


5.3 PLA technology 
The design method used for PLA has the 
following advantages compared to the design 
method used for random logic: 

1) Since the design method used for PLA has 
good uniformity, it can reduce the area 
occupied by the metal wiring and has better 
element density than the random logic 
design method. 

2) The design method used for PLA can easily 
be used for large changes in the design. 

Due to advances in process technology and 
circuit technology, a large scale PLA that can 
respond at high speed was realized on the 
LSI. 

The largest scale PLA was developed for the 
DMAC having 28 bits of input, 30 bits of out- 
put, and 3411 product terms. The response 
speed of this PLA was 12.1 ns (typical) from the 
input of the AND array driver to the output of 
the OR array sense amplifier. 

Compression and division technology 
becomes important in PLA development. The 
number of product terms of the DMAC PLA was 
reduced from 7306 to 3411 by the Fujitsu 
compression tool, and the PLA was divided into 
20 blocks. 
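
The idea of product term compression can be shown on a toy PLA. The sketch below evaluates an AND/OR array and merges pairs of product terms that drive the same outputs and differ in exactly one input literal. This simple pairwise merging is a stand-in chosen for illustration, not the algorithm of the Fujitsu compression tool, and all names are hypothetical.

```python
def pla_eval(terms, inputs):
    """Evaluate a PLA. terms is a list of (cube, output_mask); a cube is a
    string over '0', '1', '-' with one character per input bit."""
    out = 0
    for cube, mask in terms:
        # AND array: the term fires when every specified literal matches.
        if all(c == "-" or c == b for c, b in zip(cube, inputs)):
            out |= mask  # OR array
    return out


def merge(a, b):
    """Merge two cubes differing in exactly one specified literal, else None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(a) != len(b) or len(diff) != 1:
        return None
    i = diff[0]
    if a[i] == "-" or b[i] == "-":
        return None
    return a[:i] + "-" + a[i + 1:]


def compress(terms):
    """Repeatedly merge mergeable terms that drive the same outputs."""
    terms = list(terms)
    changed = True
    while changed:
        changed = False
        terms = list(dict.fromkeys(terms))  # drop exact duplicates
        for i in range(len(terms)):
            for j in range(i + 1, len(terms)):
                (ca, ma), (cb, mb) = terms[i], terms[j]
                merged = merge(ca, cb) if ma == mb else None
                if merged is not None:
                    terms[i] = (merged, ma)
                    del terms[j]
                    changed = True
                    break
            if changed:
                break
    return terms
```

For scale, the reduction quoted above leaves 3411/7306, or about 47%, of the original product terms, and 20 blocks hold about 171 terms each on average.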

The gate scale was calculated assuming that 
the above PLA is designed using random logic. 
This calculation confirmed that the PLA for 
3411 product terms corresponds to a random 
logic block of approximately 15000 or more 
gates. It was concluded that the PLA design 
method is better than the random logic design 
method from the viewpoint of layout area, 
because with random logic the proportion of 
the area occupied by wiring, in addition to 
the area occupied by the gates themselves, 
increases as the number of elements increases. 


5.4 Design automation (DA) technology 

In the verification stage of logical design, a 
mixed-level simulator that can verify dynamic 
transistor circuits even when gates and 
transistors coexist was developed. 

For layout design, an intra-block and inter- 
block automatic wiring tool was developed. The 
reliability of the LSI design was improved sig- 
nificantly by fully automating the verification of 
the layout pattern. 

The entire design time was reduced to 
approximately two thirds that of the conventional 
layout design method. Figure 11 shows the 
effect of the reduced layout design time. 
Automatic wiring and fully automated verification 
of the layout pattern against the logic diagram, 
both used with the cell-based layout design 
method, greatly contributed 
to the reduction of the design time. 


6. Future product expansion plans 

The Gmicro F32 family products will be 
expanded emphasizing the following points: 

1) Upward and downward expansion of the MPU 

Performance requirements are increasing 
year by year and the advent of microprocessors 
having a 100 MIPS processing speed is expected 
by 2000 A.D. These requirements will also 
apply to the Gmicro F32 microprocessor 
family. As the architecture is improved and the 
technology advances, higher-level processors than 
the F32/300 will be developed in the future to 
improve the current MPU functions and 
performance. 

It is expected that MPU chips that integrate 
peripheral circuits (made possible by high 
integration technology based on improved 
semiconductor technology) will form an important 
future market independent of improvements in 
performance. 

For this purpose, downward expansion and 
broader applications of multifunction 
microprocessors with excellent cost effectiveness 
based on the Gmicro F32/100 MPU core will 
be developed to meet the ASIC requirements. 

2) Expansion of peripheral LSI 

MPU performance greatly depends on the 
functions and performance of the peripheral 
LSI. Peripheral LSI devices having improved 
functions, such as image display, processing 
controllers, and bus peripheral control LSI 
devices, will be developed in the future to 
improve the throughput of the entire system. 

3) Enhanced support tools 

Software for 32-bit microprocessors will 
become larger in scale and more complex. 

How quickly a high-reliability program can be 
developed greatly depends on the state 
of the development support environment. All 
support tools will be further enhanced and 
advanced by improving man-machine interfaces 
and providing all the necessary support tools. 


7. Conclusion 

The concept of the Gmicro F32 family 
products and an outline of the products have 
been described. 

The demands of the microprocessor market, 
especially for 32-bit microprocessors based on 
TRON architecture, will continue to expand. 

Related products will be sequentially de- 
veloped and provided at the same high level of 
quality and performance. Thus, the total system 
will be supported. 


Lightwave semiconductor devices are one of the keys to building lightwave communication 
systems. In this paper, Fujitsu lightwave semiconductor devices now being produced are 
reviewed according to individual device characteristics and system applications. 


1. Introduction 

The use of lightwave communication systems 
has grown rapidly. Their use in different types 
of systems such as high-speed, long-haul trans- 
mission systems, local area networks (LAN), 
and subscriber loop systems has also been 
carefully studied1). The type of lightwave 
device to be used in a system must be decided: 
whether to use a Light Emitting Diode (LED) or 
a Laser Diode (LD) as the emitter, and whether 
to use a PIN Photodiode (PIN PD) or an 
Avalanche Photodiode (APD) as the detector. 
Because characteristics and cost differ greatly 
between an LD and an LED, and between a PIN PD 
and an APD, the diode must be selected by 
weighing the characteristics required for the 
optical communication system against cost. 

Fujitsu lightwave semiconductor devices 
can be used in a wide range of lightwave 
communication systems, from LANs to multi-gigabit 
transmission systems and to submarine cable 
systems, which require high reliability. 

The development of Fujitsu's lightwave 
semiconductor devices closely reflects the 
development of lightwave communication 
systems. At first, 0.8 μm wavelength range devices 
such as GaAlAs lasers2),3), LEDs4), and silicon 
PDs5) were developed. However, communication 
systems using the 1.2 μm to 1.6 μm wavelength 
range are becoming more important for 
large-capacity, long-haul communications than 
those using the 0.8 μm wavelength range, 
because optical fibers have lower 
loss and minimum dispersion in this range6),7). 

High-speed, long-haul lightwave communication 
systems using this wavelength range have been 
developed. For these applications, Fujitsu has 
produced InGaAsP LDs and LEDs; InGaAs 
PDs and APDs; and Ge PDs and APDs. 
Characteristics of the devices, especially 1.2 μm to 
1.6 μm wavelength range devices, are described 
individually in the following chapters. 


2. Emitters 

LEDs and LDs are important because their 
appearance has enabled lightwave 
communications. LEDs are simple electroluminescent 
devices. They are economical and easy to use, 
and are used in small-capacity, short-distance 
communication systems. Laser diodes emit 
coherent light and are used in all lightwave 
communication systems from LANs to 
multi-gigabit transmission systems. 


2.1 Laser diodes 

Fujitsu manufactures two types of LDs. 
One is a Fabry-Perot (FP) laser and the other 
is a distributed feedback (DFB) laser. FP lasers 
emit multi-mode light and are used in a few 
hundred megabit systems. DFB lasers emit 
single-mode light and are used in multi-gigabit 
systems. Fujitsu's FP LD is referred to as a 
V-grooved substrate buried heterostructure laser 
(VSB LD), and its DFB LD is referred to as a flat 
surface buried heterostructure distributed 
feedback laser (FBH-DFB LD). 

2.1.1 VSB LD 

The laser characteristics required for light- 
wave communication systems are low threshold 
current, high efficiency, stable transverse and 
longitudinal mode oscillation, high speed, and 
reliability. These characteristics are fulfilled 
by a VSB LD. 

Figure 1 shows a schematic structure of the 
VSB LD8). A crescent-shaped active region is 
buried in the V-shaped groove. The surface of 
the V-groove is exactly a (111)B InP surface. 
This fact is important for highly reproducible 
fabrication and excellent transverse mode 
characteristics. A p-n-p-n structure is used 
outside the active region to effectively confine 
the injection current in the active region. For 
high speed applications, a mesa structure is 
applied to reduce the capacitance associated 
with the p-n junction in the current confinement 
layers. The width of the mesa is optimized to 
15 μm. 

Figure 2 shows typical I-L characteristics. 
The CW threshold current and slope efficiency 
at 25 °C are 15 mA and 0.25 mW/mA. Even at 
100 °C, stable CW operation is observed. 

Fig. 1—Structure of VSB-LD. (Layers labeled: p-side 
electrode, dielectric film, p-InGaAsP, p-InP, InGaAsP, 
n-InP, p-InP, n-side electrode.) 

Fig. 2—Optical output characteristics of VSB-LD for 
various operating temperatures. (Axes: optical output 
power (mW/facet) versus forward current (mA), 0 to 150 mA.) 

Fig. 3—Far field patterns of VSB-LD: a) parallel, 
b) perpendicular. (Axes: relative light intensity versus 
angle (deg).) 

Fig. 4—Result of reliability assurance test for VSB-LDs. 
(Horizontal axis: testing time (kh).) 

Fig. 5—List for main products of Fujitsu's LD. 
  F.P. type, 1300 nm: FLD130D4BK, FLD130D4SJ-A 
  F.P. type, 1550 nm: FLD150D4HM, FLD150D4BJ, FLD150D4BK, FLD150D4SJ-A 
  Distributed feedback type, 1310 nm: FLD130F1BJ, FLD130F1SJ-A 
  Distributed feedback type, 1550 nm: FLD150F1BJ, FLD150F1SJ-A 

Fig. 6—Laser packages. 

Figure 3 shows far field patterns both parallel 
and perpendicular to the junction plane. The far 
field patterns are smooth and there is no peak 
shift from low output power to high output 
power. The full widths at half maximum of the 
far field patterns are 25° parallel to the 
junction plane and 32° perpendicular to it. 

Figure 4 shows the result of the reliability 
assurance test for VSB LDs under the aging 
condition where the LDs are driven at a constant 
power of 5 mW at 50 °C. Stable operations are 
observed and high reliability is expected. If the 
laser life is defined as the time up to when the 
operating current reaches 1.5 times the initial 
value, the median lifetime is estimated to be 
5x 10° h at 50°C by assuming linear increase 
in the operating current. 
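
The estimation procedure described above can be sketched as follows: a straight line is fitted to the operating current of each device, extended to 1.5 times the initial value, and the median of the resulting lifetimes is taken. The function names and the data in the usage note are hypothetical; no measured values from the aging test are reproduced here.

```python
from statistics import median


def life_hours(times_h, currents_ma, end_ratio=1.5):
    """Extrapolate a least-squares line through I(t) to I = end_ratio * I(0)."""
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_i = sum(currents_ma) / n
    slope = (sum((t - mean_t) * (i - mean_i) for t, i in zip(times_h, currents_ma))
             / sum((t - mean_t) ** 2 for t in times_h))
    initial = mean_i - slope * mean_t  # fitted operating current at t = 0
    if slope <= 0:
        return float("inf")            # no measurable degradation
    return (end_ratio - 1.0) * initial / slope


def median_life_hours(devices):
    """devices: list of (times_h, currents_ma) pairs, one per laser."""
    return median(life_hours(t, i) for t, i in devices)
```

For example, a hypothetical device whose operating current rises from 40.00 mA by 0.01 mA per 1000 h reaches 60 mA after 2 × 10^6 h under the linear assumption.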

As described above, a VSB LD has sufficient 
lasing characteristics and reliability for a few 
hundred megabit lightwave communication 
systems. Figure 5 is a list of the various types 
of packaged devices sold by Fujitsu. Figure 6 
shows some of these packages. 

2.1.2 FBH-DFB LD 

The laser characteristic required for a bit 
rate exceeding 1 Gbit/s is exceedingly stable 
single longitudinal mode operation, because of 
the chromatic dispersion of silica optical fiber. The 
FBH-DFB LD has been developed for such 
systems. The FBH-DFB LD shows high 
fabrication yield, stable single mode operation, high 
frequency response, low noise, and 
high reliability. Both 1.3 μm and 1.55 μm 
wavelength FBH-DFB LDs are available. 

Figure 7 shows a schematic structure of the 
FBH-DFB LD9). 

A first-order corrugation is formed on the 
n-InP substrate. The corrugation periods are 
200 nm for 1.3 μm, and 240 nm for 1.55 μm. 
The corrugation depth is 40 nm, which optimizes 
the coupling efficiency of the feedback. 
Furthermore, a p-n-p-n structure outside the active 
region and a mesa structure are also employed 
for the same reason as in the VSB LD (to restrict 
the current to the active layer). The features of 
the FBH-DFB LD are as follows: 

1) Side walls of the buried active region mesa are 
shifted from the (111)A InP surface to achieve 
low threshold current, high quantum 
efficiency, and high fabrication yield. 

2) Thickness and carrier concentration of the 
current confinement layers outside the 
active region are optimized to reduce leakage 
current. 

3) The strength of distributed feedback (κL) 
is optimized; κL = 0.5-1.0, which enables 
high single mode operation yield even at 
high output power. 

Fig. 7—Structure of FBH-DFB LD. (Labels: electrode (+), 
SiO2, p-InGaAsP, p-InP, InGaAsP (active layer), n-InGaAsP 
(guide layer), InP substrate, corrugation, AR coating 
film, electrode (−).) 
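
The corrugation periods quoted above for the first-order grating can be checked against the Bragg condition, period = m × wavelength / (2 × n_eff), where m is the grating order. The sketch below derives the effective refractive index implied by each period. The effective index itself is not given in the text, so the values computed here are inferences, and the function names are hypothetical.

```python
def bragg_period_nm(wavelength_nm: float, n_eff: float, order: int = 1) -> float:
    """Grating period satisfying the Bragg condition for the given order."""
    return order * wavelength_nm / (2.0 * n_eff)


def implied_index(wavelength_nm: float, period_nm: float, order: int = 1) -> float:
    """Effective refractive index implied by a wavelength and grating period."""
    return order * wavelength_nm / (2.0 * period_nm)


for wavelength, period in ((1300.0, 200.0), (1550.0, 240.0)):
    print(wavelength, period, round(implied_index(wavelength, period), 3))
```

Both periods imply an effective index between 3.2 and 3.3, so the two quoted values are consistent with each other for a first-order grating.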

Figure 8 shows the CW I-L characteristics 
of the 1.3 μm FBH-DFB laser at various 
temperatures. A CW threshold current of 15 mA 
and slope efficiency of 0.3 mW/mA are achieved 
at 25 °C. 

Figure 9 shows the lasing spectra at 25 °C. 
Exceedingly stable single longitudinal mode 
operation with a side mode suppression ratio of 
35 dB is attained under CW condition and 
3 Gbit/s modulation. 

Figure 10 shows the pulse response of a 
narrow mesa type (5 μm width) FBH-DFB 
laser. A rise time and a fall time of less than 
100 ps are attained. The 3 dB bandwidth of the 
small signal frequency response at 5 mW 
output power is more than 6 GHz. 

Fig. 8—Optical output characteristics of FBH-DFB LD. 
(Axes: optical output power (mW) versus forward 
current (mA), 0 to 180 mA.) 

Fig. 9—Optical spectra of FBH-DFB LD: a) CW spectrum, 
b) modulated spectrum with 3 Gbit/s NRZ signal. 
(Axes: relative intensity (10 dB/div) versus wavelength, 
1300 to 1320 nm.) 

The low noise characteristic is also obtained 
as shown in Fig. 11. Relative intensity noise 
(RIN) at 8 mW as low as —160 dB/Hz is observed 
for the bandwidth of 900 MHz. 

These experimental results confirm that 
FBH-DFB lasers are most suitable for 
multi-gigabit optical communication systems. 

Fig. 10—Pulse response waveform of FBH-DFB LD. 
(Axes: intensity (arb. unit) versus time (ns).) 

Fig. 11—Relative intensity noise of FBH-DFB LD. 

The result of the reliability test was fully 
satisfactory, as for the conventional 
VSB-LD. The FBH-DFB lasers show stable 
operation under high temperature aging tests 
in terms of not only operating current but 
also lasing spectra. In particular, the expected life 
under the operating condition of 5 mW at 
70°C is longer than 8.5 x 10° h'®. Figure 5 
shows the main products of the DFB lasers. The 
FBH-DFB lasers are used for large-capacity 
long-haul transmission systems, especially sub- 
marine cable systems. 


2.2 LED 

Although LEDs have a disadvantage of lower 
overall output than LDs, LEDs still provide the 
best cost performance in short-distance trans- 
mission systems. 

In addition to laser diodes, LEDs are also 
candidates for the light sources used in optical 
LANs and optical data link systems. For systems 
such as FDDI, the demand for low-cost, high-speed 
light sources requires the development of a 
high-speed InGaAsP LED11). 

H. Yonetani et al.: Lightwave Semiconductor Devices 

Fig. 12—Structure of high speed LED. (Layers labeled: 
n-electrode, n-InP substrate, n-InP buffer layer, 
p+-InGaAsP active layer, p-InP cladding layer, 
p+-InGaAsP contact layer, dielectric film, p-electrode, 
plated Au.) 

Fig. 13—Structure of high-speed LED package. (Parts 
labeled: spherical lens, mini-lens cap, LED chip.) 

The InGaAsP LED was thus developed to 
meet this need. Figure 12 shows the device 
structure. The InP/InGaAsP layers are grown on 
an n-InP substrate by liquid phase epitaxy (LPE). 
The growth sequence of each layer is as follows. 
The first layer is an n-InP buffer. This layer is 
followed by a p+-InGaAsP active layer with the 
alloy composition adjusted to 1.3 μm emission. 
The next layer is a p-InP cladding layer. The 
final layer is a p+-InGaAsP contact layer. 

To improve the response speed, the InGaAsP 
active layer is heavily Zn doped (8 × 10¹⁸ cm⁻³) 
to reduce the injected carrier lifetime. The size 
of the emitting area has also been reduced 
(diam = 22 μm, thickness = 0.5 μm). The small 
emission diameter enables highly efficient 
coupling of LEDs to optical fiber cables and 
compensates for the deterioration of the optical 
output caused by high Zn doping. 

The method of coupling an LED chip and 
multi-mode fiber (MMF) is an important 
technique for obtaining sufficient and stable coupled 
power12). Figure 13 shows the unique LED 
package: a modified TO-46 case with a mini-lens 
cap fixed by hermetic sealing. This package 
can be effectively coupled to MMF directly, 
without GRIN lenses or spherical lenses, to 
construct a simple, economical, and reliable 
module. 

The ray path is as follows. 

First, the light generated in the emitting 
area is collimated through the spherical lens 
(n = 1.5, diam = 400 μm) mounted directly 
on the chip. Next, the collimated beam is 
converged through the second lens (n = 1.45) 
on the cap. Figure 14 shows the alignment 
tolerances. The focal length is 1 mm from 
the tip of the second lens and is long enough 
to adjust the position when the chips are made 
into modules. For axial and lateral movement, 
a 0.5 dB loss corresponds to a displacement of 
0.2 mm and ±13 μm, respectively. These wide 
alignment tolerances have the advantages of easy 
module construction and high reproduction yield 
for connector insertion. 

Fig. 14—Coupling characteristics. 

Fig. 15—Output power characteristics for different 
temperatures. (Fiber: GI-50 (N.A. = 0.21); axes: fiber 
coupled power (μW) versus forward current (mA), 
0 to 120 mA.) 

The transmission ability of high-speed 
LEDs is determined by the total performance 
of coupled power, spectral width, and response 
speed. 

Figure 15 shows the typical power launched
into a graded-index (GI-50) fiber of 50/125 µm
and 0.21 N.A. as a function of injection current
at various temperatures. The values of coupled
power for GI-50 and GI-62.5 (0.275 N.A.)
MMF are typically 22 µW (−16.6 dBm) and
56 µW (−12.5 dBm), respectively, at 100 mA DC and 25 °C.
The temperature dependence of the coupled

Fig. 16—Pulse response of LED (relative optical output
power vs. time, 0.5 ns/div).


Table 1. Specifications of high-speed LED

Absolute maximum ratings (Ta = 25 °C)

Parameter             | Symbol | Condition | Ratings            | Unit
Storage temperature   | Tstg   | —         | −50 to +150        | °C
Operating temperature | Top    | —         | (illegible) to +90 | °C
Forward current       | IF     | CW        | 120                | mA
Reverse voltage       | VR     | —         | 1                  | V

Optical and electrical characteristics (Ta = 25 °C)

Parameter                       | Symbol | Test condition                                   | Min. | Typ. | Max. | Unit
Peak wavelength                 | λp     | IF = 100 mA                                      | 1270 | 1300 | 1340 | nm
Spectral width (FWHM)           | Δλ     | IF = 100 mA                                      | —    | 140  | 160  | nm
Optical output power            | P      | IF = 100 mA                                      | 0.15 | 0.25 | —    | mW
Power launched into GI-50 fiber |        | IF = 100 mA                                      | 15   | 22   | —    | µW
Cut-off frequency               | fc     | IF = 100 mA + 20 mAp-p, 1.5 dB down from 1 MHz   | 180  | 240  | —    | MHz
Rise time                       | tr     | IF = 0-100 mA, 10-90%                            | —    | 1.5  | —    | ns
Fall time                       | tf     | IF = 0-100 mA, 10-90%                            | —    | 2.5  | —    | ns
Forward voltage                 | VF     | IF = 100 mA                                      | —    | 1.5  | 2.0  | V
Capacitance                     | Ct     | f = 1 MHz, VR = 0 V                              | —    | 300  | 400  | pF

GI-50 fiber: Graded-index fiber, core diam = 50 µm,
cladding diam = 125 µm, N.A. = 0.21
power is −0.51%/°C (normalized at 25 °C)
at 100 mA up to 90 °C.
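
The coupled power figures quoted above can be cross-checked with a short calculation. The sketch below converts microwatts to dBm and applies the quoted temperature coefficient as a simple linear derating from 25 °C. The linear model and the function names are assumptions made for illustration; the paper only states the coefficient itself.

```python
import math


def microwatt_to_dbm(p_uw: float) -> float:
    """Convert optical power in microwatts to dBm (reference: 1 mW)."""
    return 10.0 * math.log10(p_uw / 1000.0)


def derated_power_uw(p25_uw: float, temp_c: float,
                     coeff_pct_per_c: float = -0.51) -> float:
    """Linear derating of coupled power, normalized at 25 degC.

    Assumes the quoted coefficient applies linearly over the whole range,
    which is a simplification.
    """
    return p25_uw * (1.0 + coeff_pct_per_c / 100.0 * (temp_c - 25.0))


if __name__ == "__main__":
    for fiber, p_uw in (("GI-50", 22.0), ("GI-62.5", 56.0)):
        print(f"{fiber}: {p_uw:.0f} uW = {microwatt_to_dbm(p_uw):.1f} dBm")
    print(f"GI-50 at 90 degC: {derated_power_uw(22.0, 90.0):.1f} uW")
```

Under this linear assumption, the GI-50 coupled power falls by roughly one third between 25 °C and 90 °C.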

The peak wavelength and the spectral
half-width at 25 °C are typically 1300 nm
and 140 nm, respectively, at 100 mA DC. The temperature
dependences of the peak wavelength and the
half-width of the spectrum are 0.35 nm/°C and
0.2%/°C (normalized at 25 °C), respectively.

By applying 100 mA step pulses without 
prebias to this LED, the rise time and fall time 
are measured to be 1.5 ns and 2.5 ns, as shown 
in Fig. 16. This high-speed response enables 
high-speed transmission up to 200 Mbit/s. 

Table 1 lists the LED specifications.
Figure 17 is a photograph of the LEDs. 


Fig. 17—High-speed LEDs (FED130K4PC and FED130K4TF).

Fig. 18—Relative output power (%) as a function of aging time.

Since system reliability is generally
influenced by the reliability of the light source,
the reliability of the light source must be fully
evaluated by life testing. A large-scale
accelerated life test was performed at
ambient temperatures of 90 °C and 120 °C
under a constant current of 120 mA (current
density of 32 kA/cm²). The numbers of samples
tested were 145 and 144, respectively.
As shown in Fig. 18, no sudden deterioration
was observed up to an aging time of
2000 h. If the end of life is defined as the point
where a 2 dB drop in output power is observed,
the estimated values of median life at 90 °C and
120 °C are 6.5 × 10⁴ h and 1.4 × 10⁴ h, respectively. Thus
the activation energy is 0.69 eV, and the
projected median life at 35 °C and 100 mA
(50 percent duty) is 4.8 × 10⁶ h. Moreover,
a failure rate of 19 FIT has been predicted for products
from several lots over a service time exceeding
15 years. This outstanding reliability is achieved
by structural simplicity, which in turn yields
high reproducibility in fabrication, and confirms
that this LED is suitable for applications
that require very high reliability, such
as I/O channels of large computers.
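
The life projection above follows the usual Arrhenius treatment of accelerated aging data. The sketch below is illustrative only. It takes the two median lives as 6.5 × 10⁴ h at 90 °C and 1.4 × 10⁴ h at 120 °C and uses the ambient temperatures directly, which gives an activation energy of about 0.63 eV. The paper's value of 0.69 eV presumably accounts for the junction temperature rise under bias, which is not quantified in the text. The function names are hypothetical.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K


def kelvin(temp_c: float) -> float:
    """Convert degrees Celsius to kelvin."""
    return temp_c + 273.15


def activation_energy_ev(life1_h: float, t1_c: float,
                         life2_h: float, t2_c: float) -> float:
    """Activation energy from two median lives, assuming life ~ exp(Ea / kT)."""
    return (K_B_EV * math.log(life1_h / life2_h)
            / (1.0 / kelvin(t1_c) - 1.0 / kelvin(t2_c)))


def acceleration_factor(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Ratio of life at the use temperature to life at the stress temperature."""
    return math.exp(ea_ev / K_B_EV
                    * (1.0 / kelvin(t_use_c) - 1.0 / kelvin(t_stress_c)))


if __name__ == "__main__":
    ea = activation_energy_ev(6.5e4, 90.0, 1.4e4, 120.0)
    print(f"Ea from ambient temperatures: {ea:.2f} eV")
    af = acceleration_factor(0.69, 35.0, 90.0)
    print(f"Acceleration 90 degC -> 35 degC at 0.69 eV: {af:.0f}x")
```

With the paper's 0.69 eV, the 90 °C median life scales by roughly a factor of 50 at 35 °C, which is the same order as the projected 35 °C life once the lower drive current and 50 percent duty are also considered.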


3. Photodiodes 

The performance limitations of germanium
photodiodes have been overcome by using
InGaAs photodiodes, because InGaAs
photodiodes have a lower dark current, lower
multiplication noise, and higher optical sensitivity
at the 1.55 µm wavelength. As shown in Fig. 19,
Fujitsu offers a full line-up of photodiode (PD)
products with the addition of new InGaAs
photodiodes, including PIN PDs for low
operating voltage and avalanche photodiodes (APDs)
for higher sensitivity. Devices with different
photosensitive diameters and with various
housings are offered for various applications.
Figure 20 shows the detector housings.


3.1 InGaAs PIN PD 

Several operational advantages make InGaAs
PIN photodiodes the most common detectors
not only for the optical receiver front end but
also for the optical sensing head of measuring
equipment. The InGaAs PIN PD has the advantages
of low operating bias voltage, low dark current,
and high sensitivity at wavelengths over
1 µm (1.0-1.6 µm)¹³⁾,¹⁴⁾. The key technology
for constructing this diode is chloride vapor
phase epitaxy (VPE), which can produce high
quality, very low carrier concentration InGaAs
and InP layers¹⁵⁾.

Fig. 19—Photodetector products line-up (columns: material,
structure, photosensitive area diameter, part number).
Part numbers listed: FID13S51TX, FID13S51JT, FID13S51SR,
FID13S81WS, FID13S81TX, FID13S81JT, FID13S81SR, FID13S32WS,
FID13S32TX, FID13S32TU, FID13S13ST, FID13S13TX.

Fig. 20—Photodetector packages.
(JT: With multi-mode fiber pigtail,
TX: Modified TO-18 package, WS: Small
precision package with sapphire plate window,
and SR: Alumina chip carrier with low
parasitic capacitance)

Fig. 21—Cut-away view of InGaAs PIN photodiode.

Fig. 22—Dark current characteristics of InGaAs PIN
photodiode for different temperatures Ta.

Figure 21 is a cut-away view of an InGaAs
PIN photodiode. A very low carrier concentration
n⁻-InP buffer layer (N_D < 1 × 10¹⁵ cm⁻³)
is grown on the n⁺-InP substrate by VPE,

Fig. 23—Eye pattern of 50 µm diameter InGaAs PIN
photodiode at 5 V and at the wavelength of
1.55 µm. The average incident optical power is
−20 dBm and the signal is amplified by a broadband
amplifier. The upper pattern shows the driving
current waveform of the 1.55 µm DFB-LD.


followed by an n⁻-InGaAs light absorption layer
(N_D = 1-2 × 10¹⁵ cm⁻³, 2.0-2.5 µm thick) and
an n-InP window layer. The diode has a planar
structure for long term durability and electrical
stability. The surface of the diode is coated
with a plasma-CVD silicon nitride film for
passivation and antireflection.

The combination of an InGaAs absorption
layer and an InP window layer provides nearly flat
sensitivity from the 1 µm wavelength region up
to 1.6 µm. A sensitivity of 0.8 A/W is obtained
even at the 1.55 µm wavelength. The InP
window layer is also useful for reducing the dark
current. Figure 22 shows the best dark current
characteristics of a device with a 50 µm photosensitive
area diameter at various temperatures from
−10 °C to 84 °C. At room temperature, the
measured dark current is as low as 10 pA
at a reverse bias voltage of 5 V (typically
50 pA). Owing to the large band gap energy of InP
(1.34 eV), the surface leakage current (generation
current) is effectively suppressed at the low
bias voltage. The low carrier concentration of the
epitaxial layers achieves a low capacitance at a
low operating voltage. For example, the capacitance
is 0.4 pF at 5 V operation for 50 µm photosensitive
diameter devices. The diode was

Fig. 24—Median life vs. reciprocal temperature.


mounted on a 50-ohm microstrip line and its
response was evaluated. The resultant eye pattern
for a 1.55 µm pseudo-random optical pulse train
at 565 Mbit/s is shown in Fig. 23. The response
speed is limited by the CR time constant.
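
As a rough check on the figures in this section, the sketch below estimates the CR-limited bandwidth from the quoted 0.4 pF capacitance with a 50-ohm load, and the external quantum efficiency implied by a responsivity of 0.8 A/W at 1.55 µm. It assumes that only the junction capacitance and the load resistance matter; package and stray capacitance would lower the bandwidth in practice. The function names are hypothetical.

```python
import math

PLANCK_J_S = 6.62607015e-34        # Planck constant (J s)
LIGHT_SPEED_M_S = 2.99792458e8     # speed of light in vacuum (m/s)
ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge (C)


def rc_bandwidth_hz(r_ohm: float, c_farad: float) -> float:
    """3 dB bandwidth of a single-pole RC circuit: f = 1 / (2 pi R C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def quantum_efficiency(responsivity_a_per_w: float, wavelength_m: float) -> float:
    """External quantum efficiency: eta = R * h * c / (q * lambda)."""
    return (responsivity_a_per_w * PLANCK_J_S * LIGHT_SPEED_M_S
            / (ELEMENTARY_CHARGE_C * wavelength_m))


if __name__ == "__main__":
    bw = rc_bandwidth_hz(50.0, 0.4e-12)
    print(f"CR-limited bandwidth: {bw / 1e9:.1f} GHz")
    eta = quantum_efficiency(0.8, 1.55e-6)
    print(f"Quantum efficiency at 1.55 um: {eta * 100:.0f} %")
```

The estimated bandwidth of about 8 GHz is far above what a 565 Mbit/s signal requires, consistent with the open eye pattern reported.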

The reliability was investigated by conduct- 
ing accelerated aging tests at 100, 150, 200, 
and 227°C and 30V of reverse bias. The 
result of the median life was plotted as a 
function of the reciprocal temperature as shown 
in Fig. 24. The median life expected at 25 °C
can be as long as 1.8 × 10⁸ h when the activation
energy is assumed to be the worst value
of 0.84 eV.

Owing to this extra-long lifetime and
refined production control, the device has been
applied to the TAT-8 optical fiber submarine
system¹⁶⁾. The device technology has advanced
to the point that these photodiodes are expected
to be the major photodiodes in the coming
widespread lightwave system market.

Fig. 25—Cross-sectional view of InGaAs avalanche
photodiode.


3.2 InGaAs APD 

InGaAs APDs, with the InGaAs light absorption
layer separated from the InP multiplication
layer, have been considered to be the most
effective detectors for the receiver front end
in the long wavelength region (1.3 µm and
1.55 µm). A number of improved receiver sensitivity
results from gigabit-rate optical transmission
experiments using InGaAs APDs have been
reported¹⁷⁾.

A cross-sectional view of the InGaAs APD is
shown in Fig. 25¹⁸⁾. The n-InP multiplication
layer is completely embedded in the low carrier
concentration n⁻-InP layers to achieve an effective
guard-ring effect. This type of structure is called
the "buried structure". The wafer is fabricated
in two steps by liquid phase epitaxy (LPE).
In the first epitaxial growth, the n-InP buffer
layer, n-InGaAs light absorption layer, n-InGaAsP
intermediate layer, and n/n⁻-InP layers are
grown on a (111)A faced InP substrate. After
selective mesa etching of the top n-InP layer,
the second growth is made and the mesa is
covered with a low carrier concentration n⁻-InP
layer. The active pn junction and the guard
ring are formed by thermal diffusion of Cd
and ion implantation of Be, respectively¹⁹⁾.

Figure 26 shows the photocurrent multiplica- 
tion characteristics at 1.55 um. The separated ab- 
sorption and multiplication layer configuration 
results in unique multiplication characteristics. 
Onset of photoresponse is observed around 

Fig. 26—Photocurrent multiplication characteristics and
spatial distribution of multiplication of InGaAs
APD.


30 V of reverse bias. At this bias, the depletion
layer reaches the n-InGaAs absorption layer
and photogenerated carriers (holes) are injected
into the multiplication layer. As the bias voltage
is increased, the multiplication factor increases,
and a maximum multiplication of
50 is achieved at a primary photocurrent of 2 µA.
A quantum efficiency of more than 80 percent
at 1.55 µm is obtained. The inset of Fig. 26
shows a two-dimensional spot-scanned
photoresponse at a multiplication factor of M = 10. One
advantage of the buried structure²⁰⁾ is the
uniformity of the multiplication distribution in
the photosensitive area.

A thin InGaAsP intermediate band gap layer
is inserted between the InP and InGaAs layers
to buffer the large valence band discontinuity.
This is the key design feature for achieving a wide
bandwidth, because it eliminates the hole pile-up effect
at the hetero-interface.

Fig. 27—Frequency response of InGaAs APD at M = 5,
10, 20, and 30 at a wavelength of 1.55 µm; the diameter
of the diode's sensitive area is 50 µm.

Fig. 28—Multiplication factor vs. cut-off frequency
(λ = 1.55 µm, RL = 50 Ω, gain-bandwidth product = 30 GHz).


The small signal frequency response of a diode with a
50 µm photosensitive area at various
multiplication factors is shown in Fig. 27.
The load resistance was 50 ohms and the light
wavelength was 1.55 µm. In Fig. 28, the cut-off
frequency is plotted as a function of the
multiplication factor. Even in the low multiplication
region (M = 5), the cut-off frequency
exceeds 3 GHz. Above M = 10, the cut-off
frequency is limited by the avalanche build-up
time. Linear extrapolation to unity multiplication
gives a gain-bandwidth product of
30 GHz. This favorable frequency response, as
well as a high quantum efficiency at 1.55 µm, is
characteristic of InGaAs APDs. Low multiplication
noise is another characteristic advantage
of the InGaAs APD because of the comparatively

Fig. 29—Results of aging test.


large hole-to-electron ionization coefficient
ratio of InP. The excess noise factor is about
5 at M = 10, which is about a 3 dB improvement
in noise power compared with a conventional
Ge APD.
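
The gain-bandwidth and noise figures quoted above can be related with two standard expressions. The sketch below assumes that, in the build-up-limited region, the cut-off frequency equals the gain-bandwidth product divided by M, and it uses McIntyre's excess noise formula, F = kM + (1 − k)(2 − 1/M). McIntyre's formula is a textbook model that is not named in this paper, and k here denotes the effective ratio of the weaker to the stronger ionization coefficient for the injected carrier. The function names are hypothetical.

```python
import math


def cutoff_ghz(gbp_ghz: float, m: float) -> float:
    """Cut-off frequency in the build-up-limited region: f_c = GBP / M."""
    return gbp_ghz / m


def excess_noise_factor(m: float, k: float) -> float:
    """McIntyre excess noise factor for an effective ionization ratio k."""
    return k * m + (1.0 - k) * (2.0 - 1.0 / m)


def ionization_ratio(f_excess: float, m: float) -> float:
    """Invert McIntyre's formula for k, given F at multiplication M."""
    base = 2.0 - 1.0 / m
    return (f_excess - base) / (m - base)


if __name__ == "__main__":
    for m in (10, 20, 30):
        print(f"M = {m}: f_c = {cutoff_ghz(30.0, m):.1f} GHz")
    k = ionization_ratio(5.0, 10.0)
    print(f"Effective k for F = 5 at M = 10: {k:.2f}")
    # A detector with k = 1 has F = M, the worst case of this model.
    ratio_db = 10.0 * math.log10(excess_noise_factor(10.0, 1.0) / 5.0)
    print(f"Noise advantage over k = 1: {ratio_db:.1f} dB")
```

In this model a detector with k close to 1 has F = 10 at M = 10, so F = 5 corresponds to the roughly 3 dB advantage over a Ge APD stated in the text.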

Figure 29 shows the accelerated aging test 
results at 150°C. The diodes were operated 
under the breakdown condition at 100 uA of re- 
verse current. The dark current at 90 percent of 
the breakdown voltage, which is the most 
sensitive parameter for aging, maintains a 
constant value for 30 samples after 3000h 
of aging. 

InGaAs APDs are recognized as a promising
candidate for the detector of multi-gigabit
lightwave systems. Work is continuing to
improve the response characteristics²¹⁾,²²⁾.


4. Conclusion 

Due to the expansion of optical fiber com- 
munication systems, demands for optical devices 
have been increasing rapidly. 

Fujitsu has been developing many products 
to meet these demands. The products include 
DFB LDs and InGaAs APDs for high-speed, 
long-haul systems and LEDs for optical LAN 
systems.

These devices offer not only good electrical 
and optical characteristics but also high reli- 
ability. 

This report describes the state-of-the-art Fujitsu microwave semiconductors. The important
parameters of GaAs power FETs are efficiency, linearity and reliability. The design philosophy
for these parameters and performance are discussed. Low noise performance of
HEMTs has been demonstrated and a new 1/4 µm gate HEMT has been developed having a
0.58 dB noise figure and 12.35 dB of associated gain at 12 GHz. MMICs have high potential
for wide band, small size, and lightweight equipment. GaAs FET modules and amplifiers are
also described as examples of actual applications.


1. Introduction 

Microwaves are used in telecommunication, 
radar, and in other systems. In telecommunica- 
tion applications, coaxial cable systems and 
optical fiber systems are widely used as well as 
microwave radio. People involved with these 
systems are concerned about the cost per unit 
of data transferred over a specified distance. 
Optical communication systems are improving 
in terms of data transmission cost. 

However, microwave systems still have
many advantages in several applications. Satellite
and mobile communication are good examples.
Satellite communication systems, including
broadcasting, deliver information to many
terminals located over a wide area at the same
time. Portable terminals can only use radio
frequencies, and radar and other electronic
warfare equipment is designed to fully
utilize microwaves.

High bit rate digital radio systems such as
256 QAM can compete with optical systems
in many cases. It has long been desired to
replace microwave electron tubes with solid
state devices because of the high speed and
controllability of solid state devices. The GaAs
power FET, first reported in 1973¹⁾, has enabled
the realization of amplifiers that replace the Travelling
Wave Tube Amplifier (TWTA). Low noise
HEMTs with excellent performance are now
being made. These HEMTs will help in system
design.

Solid state equipment of small size and 
weight will become more and more important 
in the future. Microwave Monolithic Integrated 
Circuits (MMICs) will make an impact in this 
area. 

Although much equipment still needs 
electron tubes, microwave semiconductors are 
surpassing their performance and will replace 
them in the future. Furthermore, new applica- 
tions will be realized using such devices. 

In this paper, the design, fabrication tech- 
nology and performance of GaAs power FETs, 
low noise HEMTs, and MMICs are described. 
Modules and amplifiers are shown as examples 
of their applications. 


2. GaAs power FETs 
2.1 GaAs FET design 

Gallium arsenide (GaAs) is a very good
material for microwave devices because it has
a high electron mobility and can easily be used
to make a high-resistance layer or substrate.
A field effect transistor using this material
was first proposed in 1966²⁾.

Figure 1 shows the structure of a GaAs
FET and its small signal equivalent circuits.

Fig. 1—Structure of GaAs FET and its small signal
equivalent circuits (layers shown include the buffer layer
and the i-substrate).

Fig. 2—DC characteristics and large signal operation
of a GaAs FET (drain current and reverse gate current
vs. drain-source voltage).

To obtain high gain at microwave frequencies, 
a short gate length is required. For a low-noise 
small-signal FET, the gate length is the only 
major parameter. On the other hand, many 
things must be taken into consideration for 
a power FET. 

Figure 2 shows the DC characteristics and
large signal operation of a power FET. In
Fig. 2, A is the bias point and the optimum
load line (B-C) is obtained for the maximum
output power. When the input signal increases,
the output signal swing reaches the regions of
high current at low voltage (B) and low current
at high voltage (C) of the load line.

After reaching B and C, the forward and
reverse gate currents start to flow and the drain
current and voltage are clipped. The maximum
linear output power, $P_{l\max}$, is given by the
following equation.

$$P_{l\max} = \frac{1}{8}\,(I_{DS1} - I_{DS2})\,(V_{DS2} - V_{DS1})$$

The maximum saturated power, $P_{sat}$, is
derived by Fourier expansion of the distorted
waveform (finally a rectangular form), and is
calculated to be 2 dB higher than $P_{l\max}$. The
maximum channel current ($I_{DS1}$), knee voltage
($V_{DS1}$), breakdown voltage ($V_{DS2}$), and
leakage current ($I_{DS2}$) depend on the carrier
density and the thickness and structure of the
channel layer.
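
The load-line expression above is easy to evaluate numerically. The sketch below computes the maximum linear output power and the saturation margin obtained when the sinusoidal swing is replaced by a rectangular one, whose fundamental component is 4/π times larger in amplitude. That is one way to arrive at the quoted figure of about 2 dB. The device values in the example (a 1 mm gate at 300 mA/mm, 10 mA leakage, 1.5 V knee, 18 V breakdown) are hypothetical and are not taken from the paper.

```python
import math


def max_linear_power_w(i_ds1_a: float, i_ds2_a: float,
                       v_ds1_v: float, v_ds2_v: float) -> float:
    """Maximum linear output power: (I_DS1 - I_DS2)(V_DS2 - V_DS1) / 8."""
    return (i_ds1_a - i_ds2_a) * (v_ds2_v - v_ds1_v) / 8.0


def saturation_margin_db() -> float:
    """Power gain of a square wave's fundamental over a sine of equal swing."""
    return 20.0 * math.log10(4.0 / math.pi)


if __name__ == "__main__":
    # Hypothetical example values, chosen only to illustrate the formula.
    p_lin = max_linear_power_w(i_ds1_a=0.30, i_ds2_a=0.01,
                               v_ds1_v=1.5, v_ds2_v=18.0)
    p_sat = p_lin * 10.0 ** (saturation_margin_db() / 10.0)
    print(f"P_lmax = {p_lin:.3f} W")
    print(f"Saturation margin = {saturation_margin_db():.2f} dB")
    print(f"P_sat = {p_sat:.3f} W")
```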

Generally, leakage current ($I_{DS2}$) and
knee voltage ($V_{DS1}$) conflict with each other
and must be optimized. Fujitsu power GaAs
FETs use a recessed gate structure which is
optimized for high drain-source breakdown
voltage and high gate-drain breakdown voltage.
This optimization enables high voltage (10 V)
operation³⁾.

To obtain high output power, we must
increase $I_{DS1}$. However, the channel structure is
optimized for $V_{DS1}$ and $I_{DS2}$ as discussed
above, and $I_{DS1}$ per gate width is set to
300 mA/mm. Therefore, the gate width must
be made wider to increase $I_{DS1}$.

A GaAs FET has three terminals on one
surface as shown in Fig. 1. In order to make a
wider gate, a crossover structure is required.
Fujitsu has used an interdigitated pattern which
includes an insulator or air gap crossover.
In this pattern, the unit gate width (= finger
length) and gate-gate separation are optimized.
The unit gate width depends on the operational
frequency. For example, the unit gate width is 50 µm
for devices designed for Ku-band operation and
240 µm for L- or S-band operation.

For high power GaAs FETs, thermal design
is also important, because the reliability
of GaAs FETs depends primarily on the
channel temperature, as it does for
other semiconductor devices. The thermal
conductivity of GaAs is 0.46 W/cm/°C, which is
one third that of silicon.

The thermal resistance of a GaAs FET
having an interdigitated pattern is expressed by
the following formula.


R = 10° x | I lo (34 
th Fy 46 ln a) “2 —atd 
uy 1 
(n—1).b Fa —T 
or nb (2t +1) 
oe Site Hin — 1b ard! 


where a: gate length (µm)
      b: gate-gate separation (µm)
      l: unit gate width (µm)
      n: number of gate fingers
      t: thickness of GaAs (µm).

It is obvious that the thermal resistance can
be significantly reduced by reducing the thickness
of the GaAs substrate. Of next importance
is the gate-gate separation. Fujitsu GaAs power
FETs use a "plated heat sink" structure in
which the GaAs substrate is lapped and etched
to a 25 µm thickness and a thick (35 µm) gold
heat sink is formed. The total thermal resistance
includes that of the FET chip itself and its
package. Since a GaAs FET chip has an insulating
substrate, the FET chip can be mounted directly
on the copper heat sink of the package.

This is much easier than for a silicon bipolar
power transistor, which has a collector substrate
and must be mounted on an insulating substrate
such as beryllium oxide (BeO). The other advantage
of a GaAs FET's thermal properties is that
the thermal coefficient of the drain current is
negative, as in other field effect transistors.
Therefore, thermal runaway is suppressed.




2.2 Fabrication technology 

To realize a device based on the design men- 
tioned above, the following wafer process 
technology is used. 

First, a high-resistivity buffer layer and an n-type
active layer are grown sequentially on a
Cr-doped semi-insulating GaAs substrate by vapor
phase epitaxy. After mesa formation, source and
drain ohmic contacts are made using alloyed
Au/Ge or Au/Ge/Ni.

Next, a recessed channel is formed between 

Fig. 3—Structure of Fujitsu GaAs power FET.


the source and drain electrodes and an aluminum
gate is evaporated on the recessed channel.
In Ku- and Ka-band devices having a very short
gate length (about 0.25 µm), a mushroom
shaped multi-metal layer (WSi/Ti/Au) structure
is applied to reduce gate resistance and prevent
the metal voiding discovered in long term life
tests of aluminum gate FETs⁴⁾. These processes
use electron beam exposure technology. A
silicon nitride passivation film is applied on the
channel region and electrodes. A multi-layer of
Ti/Pt/Au is then formed for the source and drain
electrodes, which require low current density, and
for the bonding pads.

After finishing the top surface and probing 
of each chip, the GaAs substrate is lapped and 
chemically etched to obtain a 25 um thickness. 
Source contact is made on the back side of the 
wafer via through holes, and a heat sink of 
plated gold is formed. The structure of the
Fujitsu GaAs power FET is shown in Fig. 3.
Chip separation is performed by chemical etch- 
ing. The individual chips are 100 percent visually 
inspected and are mounted in a package. 

High power FETs that deliver more than 
three watts have internal matching circuits in 
the package. These circuits make them easy to 
use and yield a high gain, high power and broad 
bandwidth. Figure 4 shows the internal view of 
a 6 GHz 18-watt device (FLM5964-14D). 

This device uses two chips in parallel and the 
total gate width is 57.6 mm. The drain and gate 




K. Ohta et al.: Microwave Semiconductor Devices 


Fig. 4—Internal view of FLM5964-14D. 


patterns of the two chips are connected together 
to prevent low frequency oscillation. The pack- 
age used is designed for stable operation at high 
frequencies. It has RF feedthroughs made using 
alumina ceramics in a metal wall. 

The matching circuits consist of several 
materials selected for optimum power output, 
gain, and bandwidth. 


2.3 Performance of GaAs power FETs 

Of the many parameters of a GaAs power 
FET, efficiency is considered to be the most 
important one. 

When GaAs power FETs are used in a 
satellite transponder, DC power is limited by 
the solar cell system. A high efficiency also helps 
maintain a low device channel temperature and 
so prolongs the device life time. 

Figure 5 shows the output power $P_{out}$ and
power added efficiency $\eta_{add}$ of the FLM3742-10
as a function of the input power $P_{in}$. A power
added efficiency of over 50 percent is obtained.
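
Power added efficiency is conventionally defined as the RF power added by the device divided by the DC power supplied. The sketch below applies that standard definition. The operating point used in the example (30 dBm in, 40 dBm out, 10 V and 1.7 A drain supply) is hypothetical and is not read from Fig. 5.

```python
def dbm_to_watt(p_dbm: float) -> float:
    """Convert power in dBm to watts."""
    return 10.0 ** (p_dbm / 10.0) / 1000.0


def power_added_efficiency(p_in_dbm: float, p_out_dbm: float,
                           v_ds_v: float, i_ds_a: float) -> float:
    """PAE = (P_out - P_in) / P_dc, with P_dc = V_DS * I_DS."""
    p_dc_w = v_ds_v * i_ds_a
    return (dbm_to_watt(p_out_dbm) - dbm_to_watt(p_in_dbm)) / p_dc_w


if __name__ == "__main__":
    # Hypothetical operating point, for illustration only.
    pae = power_added_efficiency(p_in_dbm=30.0, p_out_dbm=40.0,
                                 v_ds_v=10.0, i_ds_a=1.7)
    print(f"PAE = {pae * 100:.1f} %")
```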

For terrestrial communication, a high transmission
capacity is required to reduce costs.
64 QAM digital radio systems have been developed
by many companies to compete with optical
fiber transmission systems. A high capacity 256
QAM system is currently being investigated for
the next generation system.

In this application, intermodulation distor- 
tion is the critical parameter to obtain a low bit 
error rate. Figure 6 shows the third and fifth 

Fig. 5—$P_{in}$ vs. $P_{out}$ and power added efficiency of
FLM3742-10.

Fig. 6—IM3 and IM5 of FLM5964-14D.


order intermodulation distortion (IM3 and IM5)
of the FLM5964-14D. An IM3 of −45 dBc and
an IM5 of −70 dBc were obtained at $P_{out}$ =
31.5 dBm for a single carrier level using a two
tone test.
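
A two-tone result of this kind is often summarized as an output third-order intercept point. The sketch below uses the common small-signal rule that the third-order product rises 3 dB for each 1 dB of carrier power, so that OIP3 = P + |IM3|/2, with the third-order level taken as −45 dBc at 31.5 dBm per carrier. The 3:1 slope is an assumption that holds only well below saturation, and the function names are hypothetical.

```python
def oip3_dbm(p_tone_dbm: float, im3_dbc: float) -> float:
    """Output third-order intercept from one two-tone measurement."""
    return p_tone_dbm - im3_dbc / 2.0


def im3_dbc_at(p_tone_dbm: float, oip3: float) -> float:
    """Predicted IM3 (dBc) at another per-tone power, assuming a 3:1 slope."""
    return -2.0 * (oip3 - p_tone_dbm)


if __name__ == "__main__":
    oip3 = oip3_dbm(31.5, -45.0)
    print(f"OIP3 = {oip3:.1f} dBm")
    print(f"IM3 at 3 dB back-off = {im3_dbc_at(28.5, oip3):.1f} dBc")
```

Under this rule, backing the carriers off by 3 dB would improve the third-order level by about 6 dB.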

The other application of GaAs power FETs 
is for pulse operation such as in a radar system. 
A GaAs FET amplifier can operate with very 
fast pulse switching. Figure 7 shows the pulse 

Fig. 7—Pulse power performance of FLM5359-20P.

Fig. 8—Output power of a 0.25 µm power FET.


performance of the FLM5359-20P. Its power
and gain are much improved compared to CW
operation. This improvement results from the
lower channel temperature in pulse operation.
A device for higher frequency applications was
also developed. The output power at 18 GHz
and at 38 GHz is shown in Fig. 8. The FET has
a WSi/Ti/Au gate with dimensions of 0.25 µm
× 600 µm.

Fig. 9—Top view of 0.25 µm gate HEMT.


Fig. 10—Cross sectional view of 0.25 um gate HEMT. 


3. Low noise HEMTs 
3.1 HEMT design and fabrication technology 

A low noise HEMT was first
reported in 1983⁵⁾. Since then, there have been
many important technical improvements in the
quality and structure of molecular beam epitaxial
growth (MBE)⁶⁾ and in the precise control
of the process⁷⁾. Based on these improvements,
a new low-noise HEMT was developed in 1988
using an optimized MBE wafer and a quarter
micron mushroom gate structure.

Figure 9 shows the top view and Fig. 10
shows the cross-sectional view of the newly
developed 200 µm gate width HEMT. The
quality of the Two Dimensional Electron Gas
(2-DEG) on the intrinsic GaAs side of the interface
of the n⁺-doped AlGaAs/i-GaAs is one of
the key parameters for low noise performance.
Using this improved MBE technology, we fabricated
an n⁺ AlGaAs layer having a controlled
doping level and thickness and a pure intrinsic
GaAs layer having a sharp transition at the
interface.


Fig. 11—Effects of gate length $L_g$ (µm) on $C_{gs}$ and $g_m$ (mS).

These layers were also optimized to reduce
the short channel effect, which becomes serious
as the gate length is reduced. A quarter micron
gate provides a lower gate capacitance ($C_{gs}$) and
higher transconductance ($g_m$); these are the two
most important device parameters for a low
noise device. The quarter micron gate is fabricated
using electron beam lithography (EBL)
and has good reproducibility. A mushroom
shaped gate was fabricated using a double layer
of Au and WSi. The WSi forms the Schottky
interface and the Au reduces the gate resistance
($R_g$) as shown in Fig. 10. The effects of gate
length on $C_{gs}$ and $g_m$ are shown in Fig. 11. The
effects of $C_{gs}$ and $g_m$ on the noise figure and
associated gain are shown in Fig. 12.


3.2 Performance of low noise HEMTs 

Figure 13 shows the minimum noise figure
($F_{min}$) and associated gain ($G_{as}$) vs. drain current
($I_{DS}$) at 12 GHz of a quarter micron HEMT
compared to those of a half micron HEMT.

Figure 14 shows $F_{min}$ and $G_{as}$ vs. frequency
($f$) at $I_{DS}$ = 10 mA and $V_{DS}$ = 2 V for a quarter
micron HEMT. The optimized $F_{min}$ was 0.58 dB
at 12 GHz and 1.0 dB at 20 GHz. The associated

Fig. 12—Effects of $g_m$ and $C_{gs}$ on noise figure and associated gain.

Fig. 13—Noise figure and associated gain vs. drain
current.

Fig. 14—$F_o$ and $G_{as}$ vs. frequency (optimum noise figure
$F_o$ and associated gain $G_{as}$ in dB vs. frequency $f$ in GHz).


$G_{as}$ is 12.35 dB at 12 GHz and 9.0 dB at 20 GHz.
Equivalent circuit analysis confirmed remarkable
improvements in $C_{gs}$ and $g_m$ for the quarter micron
gate HEMT, while $R_g$ was kept low.
No remarkable short channel effect was
observed in the DC characteristics of the quarter
micron HEMT, owing to the epitaxial layer
optimized by MBE.

Fig. 15—$G_{as}$ vs. $F_o$ for quarter micron and half micron
HEMT.

One method of analyzing the noise characteristics
is to consider the noise power ($P_n$)
generated in the device⁸⁾. Figure 15 shows
$G_{as}$ vs. $F_{min}$ of a quarter micron and half micron
HEMT at various values of $P_n$ and frequency.

Since the extrinsic regions are almost the
same for a half micron and quarter micron
HEMT, the difference in $P_n$ between the two is
clearly due to the intrinsic regions. In conclusion,
reducing the gate length of a HEMT results
in an improvement in the noise figure and
associated gain.

Further improvement can be expected from
a shorter gate device (e.g. 0.1 µm gate length)
together with precise control of the wafer
process and an optimized epitaxial layer.


4. GaAs MMICs 
4.1 Basic concept 

Microwave Monolithic Integrated Circuits 
(MMICs) are now in demand for microwave 
communication systems and electronic war- 
fare (E/W). The process technology, electrical 
characteristics, and uniformity of GaAs FETs 
have greatly improved, and MMICs are now 
feasible. 

In MMICs, several different components
(such as FETs, resistors, capacitors and inductors)
are fabricated on one chip. The use of a
GaAs substrate has many advantages compared
to a Si substrate because of its high bulk resistivity
and high electron mobility. This results
in less transmission loss, less leakage current
and a higher cut-off frequency than conventional
FETs.

Fig. 16—Schematic circuit diagram of FMM021806XV.

Fig. 17—Input circuit of FMM021806XV.

GaAs MMICs have the following advantages 
over hybrid ICs: 

1) Reduction of system size and weight 

2) Improvement of broadband performance 

3) Improvement of high-frequency performance 
4) Enhanced inherent reliability 

5) Uniform electrical characteristics. 

Several types of MMICs have already been
developed at Fujitsu, including wide-band distributed
amplifier chips, 14 GHz and 20 GHz gain
block amplifiers and prescalers. This report
introduces the FMM021806XV distributed
amplifier developed for wide-band operation
from 2 GHz to 18 GHz. The distributed amplifier
(travelling wave amplifier) is used as a wide-band
MMIC for the following reasons:

1) Ultra wide-band characteristics 

2) Less critical in scattering of parameters 

3) Easy to monolithically fabricate on GaAs 
substrate. 

Item 2) is a particular advantage over other
types of amplifiers such as reflection or feedback
type amplifiers.

Fig. 18—Top view of FMM021806XV
(chip size: 15 x 2.5 mm).


4.2 Circuit design 

Figure 16 shows the schematic circuit dia- 
gram of the FMM021806XV chip. The input 
circuit forms a low-pass filter as shown in 
Fig. 17. The cut-off frequency mainly depends
on the length of the transmission line and the
gate-to-source capacitance Cgs of the unit FET.
If the cut-off frequency is set sufficiently high 
and the characteristic impedance is set to 
50 ohms, the return loss of the input port is 
very low over a wide frequency range. A similar 
situation exists for the output port. 
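
As a rough illustration of the relation between Cgs, the line impedance and the cut-off frequency, the input line can be approximated as a lossless constant-k low-pass ladder in which each section has a series inductance L (from the transmission line) and a shunt capacitance Cgs. This approximation, the function names and the 0.3 pF value of Cgs are assumptions made for this sketch and are not taken from the paper.

```python
import math


def section_inductance(z0_ohm: float, c_shunt_f: float) -> float:
    """Series inductance per section that gives Z0 = sqrt(L / C)."""
    return z0_ohm ** 2 * c_shunt_f


def cutoff_frequency(l_series_h: float, c_shunt_f: float) -> float:
    """Cut-off frequency of a lossless constant-k low-pass ladder."""
    return 1.0 / (math.pi * math.sqrt(l_series_h * c_shunt_f))


# Hypothetical values, not taken from the paper.
CGS_F = 0.3e-12   # assumed gate-to-source capacitance of one unit FET
Z0_OHM = 50.0     # target characteristic impedance

L_SECTION_H = section_inductance(Z0_OHM, CGS_F)
FC_HZ = cutoff_frequency(L_SECTION_H, CGS_F)

print(f"series inductance per section: {L_SECTION_H * 1e9:.2f} nH")
print(f"cut-off frequency: {FC_HZ / 1e9:.1f} GHz")
```

Under these assumptions a smaller unit FET (smaller Cgs) raises the cut-off frequency, which is consistent with the unit FET size appearing among the key design points.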

This is the essential point of the distributed 
amplifier. When designing a distributed ampli- 
fier, the key points are as follows: 

1) Unit FET size, which determines Cgs

2) The number of FET linkages, which is deter- 
mined by the target gain and gain flatness 

3) Phase matching of FET operation 

4) Cut-off frequency of the FET and in/out 
signal lines 

5) Circuit structure of termination including 
the biasing circuit. 

These points are optimized by using microwave
circuit simulation software. The equivalent
circuit of a unit FET is determined from its
S-parameters up to 26.5 GHz. The terminating
circuit and the in/out transmission circuit are
optimized separately. Then, the total circuit is
optimized. The pattern layout is optimized so that
the elements do not interfere with each other.
The input and output ports of the RF signal are
placed at opposite sides of the longer line for
cascadability. Figure 18 shows the top view of the chip.


4.3 Fabrication technique 
1) FET

The important requirements for MMIC 
fabrication are the uniformity and reliability of 
the elements. The active region of the FET is 
formed by selective ion-implantation for uni- 
formity. 

Dispersion of the saturated drain current is
about 10 percent or less. The Schottky gate metal
is refractory WSi covered by Ti/Au to achieve
high reliability. The built-in potential is 0.70 V
± 0.02 V and the ideality factor is 1.10 to 1.20.
Figure 19 shows the FET channel structure.
Offset n+ implantation techniques are used to
reduce the source resistance and to increase the
output resistance. The source electrode is
grounded by a through hole to reduce source
inductance and to increase the degree of freedom
for pattern layout.

2) Resistor 

A resistor is formed using the n+ implantation
region. The sheet resistance is 200-210 Ω/□.
Dispersion of the sheet resistance in a wafer is
less than one percent.

3) Capacitor 

The capacitor has a Metal-Insulator-Metal 
(MIM) structure. The insulator is amorphous 
SiN and its thickness is set to 150 nm. 

4) Other elements 

The transmission line is fabricated using 
multi-layer metallization on a GaAs substrate. 
The air bridge technique is used to connect 
overlay metals. 
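
The resistor and capacitor figures above translate into layout dimensions in a simple way. The sketch below uses the quoted sheet resistance and insulator thickness; the relative permittivity of 7 for amorphous SiN, the 50-ohm and 1 pF target values and all function names are assumptions chosen for illustration.

```python
EPS0_F_PER_M = 8.8541878128e-12  # vacuum permittivity


def resistor_squares(r_ohm: float, sheet_ohm_per_sq: float) -> float:
    """Length-to-width ratio (number of squares) for a target resistance."""
    return r_ohm / sheet_ohm_per_sq


def mim_cap_density_ff_per_um2(eps_r: float, thickness_m: float) -> float:
    """Parallel-plate capacitance per unit area in fF/um^2."""
    farad_per_m2 = EPS0_F_PER_M * eps_r / thickness_m
    return farad_per_m2 * 1e3  # 1 F/m^2 = 1e15 fF / 1e12 um^2


# 50-ohm resistor at the lower sheet resistance quoted in the text.
SQUARES_50_OHM = resistor_squares(50.0, 200.0)

# 150 nm SiN; eps_r = 7 is an assumed typical value, not from the paper.
DENSITY_FF_UM2 = mim_cap_density_ff_per_um2(7.0, 150e-9)
AREA_1PF_UM2 = 1000.0 / DENSITY_FF_UM2  # 1 pF = 1000 fF

print(f"50-ohm resistor: L/W = {SQUARES_50_OHM:.2f}")
print(f"MIM density: {DENSITY_FF_UM2:.3f} fF/um^2")
print(f"area for 1 pF: {AREA_1PF_UM2:.0f} um^2")
```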


4.4 Electrical characteristics

Figure 20 shows the return loss and gain
of the FMM021806XV. An input/output return loss of
less than -10 dB is achieved over the entire
frequency band from 2 GHz to 18 GHz. This
means that the FMM021806XV can be cascaded
directly. The small-signal gain of two-stage
cascaded FMM021806XV chips was measured


FUJITSU Sci. Tech. J.,24, 4, (December 1988) 


K. Ohta et al.: Microwave Semiconductor Devices 


Fig. 19—Structure of FETs used in a MMIC.
(Labels in the figure: source, gate, drain; WSi/Ti/Au;
ohmic metal; n-implanted layer; n+-implanted layer.)

Fig. 20—Return loss and gain of FMM021806XV.

to be 9 dB. This is twice the gain of a single
chip. A gain flatness of less than 1 dB was
obtained. Figure 21 shows the power output
and noise figure. A P1dB of 20 dBm was achieved
from 6 GHz to 18 GHz using highly doped FETs.
MMICs of distributed amplifiers having a
lower noise figure, wider band, and higher power
are now under development.
Furthermore, MMICs having other functions,
such as a phase shifter and a switch, will be developed.

Fig. 21—Power output and noise figure vs. frequency
of FMM021806XV.
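
To see why a 10 dB return loss is adequate for direct cascading, the return loss can be converted into a reflection coefficient, a VSWR and a mismatch loss. The sketch below uses standard transmission-line relations; the function names and the 4.5 dB single-chip gain (half of the measured 9 dB two-stage gain) are the only values introduced here.

```python
import math


def reflection_magnitude(return_loss_db: float) -> float:
    """|Gamma| for a return loss given as a positive number of dB."""
    return 10.0 ** (-return_loss_db / 20.0)


def vswr(gamma: float) -> float:
    """Voltage standing wave ratio for a reflection magnitude gamma."""
    return (1.0 + gamma) / (1.0 - gamma)


def mismatch_loss_db(gamma: float) -> float:
    """Power lost to reflection at one port, in dB."""
    return -10.0 * math.log10(1.0 - gamma ** 2)


GAMMA = reflection_magnitude(10.0)
VSWR_10DB = vswr(GAMMA)
LOSS_DB = mismatch_loss_db(GAMMA)

# Gains expressed in dB add when matched stages are cascaded.
SINGLE_CHIP_GAIN_DB = 4.5
TWO_STAGE_GAIN_DB = 2 * SINGLE_CHIP_GAIN_DB

print(f"|Gamma| = {GAMMA:.3f}, VSWR = {VSWR_10DB:.2f}, "
      f"mismatch loss = {LOSS_DB:.2f} dB")
print(f"two-stage gain = {TWO_STAGE_GAIN_DB:.1f} dB")
```

At 10 dB return loss, about 10 percent of the incident power is reflected at a port, so the mismatch loss per port stays below 0.5 dB.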


5. GaAs FET modules and amplifiers 
5.1 GaAs FET modules 

GaAs FET modules have been developed for 
a new generation of microwave amplifiers that 
overcome the disadvantages of conventional 
amplifiers (e.g. bandwidth, size and development 
time). 

Fujitsu GaAs FET modules were designed 
using the technologies described below. 

1) Impedance matching (RF circuit) 

The load conditions necessary to obtain 
maximum gain for small signals differ from 
those required for large signals. The impedance 
matching circuits are designed by CAD using 
small signal and large signal S parameters. 
Figure 22 shows the fundamental single stage 
circuit. 

2) DC bias circuit 

In the design of a microwave transistor
amplifier, transistor operation must be stable
in the range from DC to the maximum oscillation
frequency. Figure 23 shows the unstable
regions on a Smith chart calculated using S
parameters. The unstable region is larger at
lower frequencies.

In the amplifier modules, the impedances
presented at lower frequencies by the bias circuit
do not enter the unstable regions, so the modules
always offer stable performance at the operating
conditions.

Fig. 23—Unstable regions of GaAs FETs.

Fig. 24—Internal view of 23 GHz amplifier module.
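
Whether a transistor has unstable regions inside the Smith chart at a given frequency can be checked from its S parameters with the Rollett stability factor K and the determinant of the S matrix. The sketch below applies this standard test; the paper does not say which criterion was used, and the S-parameter values are hypothetical.

```python
def rollett_stability(s11: complex, s12: complex,
                      s21: complex, s22: complex) -> tuple[float, float]:
    """Return (K, |delta|) for a two-port described by S parameters."""
    delta = s11 * s22 - s12 * s21
    k = (1.0 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (
        2.0 * abs(s12 * s21)
    )
    return k, abs(delta)


def is_unconditionally_stable(s11: complex, s12: complex,
                              s21: complex, s22: complex) -> bool:
    """True when no passive termination can cause oscillation."""
    k, delta_mag = rollett_stability(s11, s12, s21, s22)
    return k > 1.0 and delta_mag < 1.0


# Hypothetical S parameters (not measured data from the paper).
HIGH_FREQ = (0.5, 0.1, 2.0, 0.5)   # modest gain, well matched
LOW_FREQ = (0.9, 0.1, 3.0, 0.8)    # higher gain, strongly reflective

K_HIGH, _ = rollett_stability(*HIGH_FREQ)
K_LOW, _ = rollett_stability(*LOW_FREQ)

print(f"K (high-frequency example) = {K_HIGH:.3f}")
print(f"K (low-frequency example)  = {K_LOW:.3f}")
```

When K is below 1, part of the Smith chart is unstable and the terminations, including those set by the bias circuit, must be kept outside that part.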

3) Structure design 

Figure 24 shows the inside view of the
23 GHz amplifier module. The microwave amplifier
circuits are built, together with a DC bias circuit,
into hermetically sealed metal-ceramic
packages. RF terminals are positioned at the center
of the package and DC bias terminals are placed
at the sides. Not only an amplifier but also a
variable attenuator or another type of circuit can
be installed in the package. Using these modules,
a designer can build an amplifier with a very short
turnaround time.


5.2 Amplifiers 

Since the development of the microwave
radio system, there has been a demand for
equipment with high reliability, small size, and
low power consumption. In pursuit of this goal, electron
tubes have been replaced by microwave semiconductor
devices. The introduction of the world's
first high power GaAs FET by Fujitsu in 1973
has enabled the realization of all-solid-state
microwave radio systems.

Fig. 25—Four-stage 6 GHz 1-watt amplifier.

Fig. 26—4-8 GHz 5-watt amplifier.


In 1975, the first 6 GHz 1-watt high power 
multi-stage GaAs FET amplifier available for FM 
multiple radio systems was developed. This 
amplifier exhibited excellent signal transmission 
characteristics). This led to the replacement of 
the Travelling Wave Tube in many microwave 
application systems!®). This amplifier was fabricated
on a Teflon-glass-fiber printed board
having microstrip circuits. It is shown in Fig. 25.

Microwave amplifiers for microwave multiple
radio systems still had to be improved in terms of
output power, linearity, efficiency, and band
characteristics.

To meet these requirements, Fujitsu made
an amplifier using GaAs FET chips and MIC
technology. Figure 26 shows a 4-8 GHz 5-watt
power amplifier!”. However, this structure was
difficult to repair because it was sealed as a total
amplifier.

Fig. 27—4 GHz 5-watt amplifier.

Fig. 28—14 GHz 1-watt amplifier.

Therefore, we have developed a new genera- 
tion microwave GaAs FET amplifier. It consists 
of a driver stage amplifier using GaAs FET mod- 
ules and a booster stage amplifier using internal- 
ly matched GaAs power FETs. Many kinds of 
amplifiers have been developed having high relia- 
bility, small size, low power consumption, low 
cost, and good maintainability in the 2-23 GHz 
range. Figure 27 shows a 4 GHz 5-watt amplifier 
and Fig. 28 shows a 14 GHz 1-watt amplifier, 
both having the new design mentioned above.


6. Conclusion

Fujitsu has developed GaAs power FETs,
low noise HEMTs, broad band MMICs, and modules
for microwave equipment. Further improvements
in performance and cost will make system
design even simpler and more practical.


Characterization of Compound
Semiconductor Materials by Transmission
and Reflection Electron Microscopy


● Itsuo Umebu (Manuscript received June 29, 1988)


The interface structures of the GaAs/AlAs superlattice were analyzed by Transmission
Electron Microscopy (TEM) with the help of computer simulation. The bright spot arrays
which appear on the uppermost AlAs layer are shown to be a good indicator for this
analysis. Atomic layer steps occur at intervals of 3-10 nm. The surfaces of MBE-grown
GaAs layers were analyzed in detail with reflection electron microscopy (REM). The surfaces
consist of undulations and small steps. Anisotropic surface roughness may be due to
anisotropic Ga surface diffusion. Atomic ordering in InGaP mixed crystals was analyzed by
cross-section TEM, and a crystal model with double periodicity is proposed.


1. Introduction 

The trend toward miniaturization in micro- 
electronics demands that devices and materials 
be analyzed at the atomic level. Transmission 
Electron Microscopy (TEM) and Reflection 
Electron Microscopy (REM) are the only 
methods for such analysis that provide images 
with atomic-order resolution as well as informa- 
tion on crystal periodicity. 

This paper discusses recent topics in high-resolution
TEM and REM used to observe
the cross sections and surfaces of compound
semiconductor materials. Chapter 2 deals with
the supercomputer simulation of TEM images
for GaAs/AlAs superlattices, and then analyzes
the atomic structures of GaAs/AlAs interfaces
grown by Molecular Beam Epitaxy (MBE)
with the help of the simulation. Chapter 3
gives an analysis of the surface of MBE-grown
GaAs by REM. This observation method is shown
to be very sensitive to surface roughness. The
“natural superlattice” generated spontaneously
during the growth of mixed crystals has been
the focus of interest for researchers because it
may degrade the characteristics of the materials
or lead to the development of new materials.
Chapter 4 shows a cross-sectional image of the
natural superlattice and a proposed crystal
model.


2. GaAs/AlAs superlattice heterointerface 
2.1 Computer simulation of the lattice image” 

High resolution TEM images are produced 
by the interference of electron waves with their 
phases modulated by the potential in the 
sample. The images are drastically altered by 
such observation conditions as sample thickness 
and defocus of the objective lens. Simulations
of various observation conditions made in 
advance are a great help in understanding 
and analyzing TEM images. Simulations are 
indispensable when observing new materials 
or new structures for the first time. 

TEM images were calculated by the Cowley-Moodie
multislice method” with a FACOM VP-400
supercomputer. The parameters of the
electron microscope in routine use were adopted
for the calculation: an accelerating voltage of
200 kV, a spherical aberration coefficient of
0.4 mm, a chromatic aberration coefficient of
0.8 mm, and a beam divergence of 0.6 mrad.
This gives a point-to-point resolution of 0.18 nm.

Fig. 1—Simulated GaAs/AlAs TEM images for 6.4-nm, 11.2-nm and 16.0-nm-thick samples. Bright spot
arrays are visible for the uppermost AlAs layer for the 6.4-nm-thick sample.
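
The quoted resolution and the 38 nm optimum defocus used for Fig. 1 can be cross-checked against the microscope parameters with the Scherzer relations. The sketch below is an independent estimate, not the calculation used in the paper; the prefactors 1.2 and 0.66 are one common convention (values between roughly 0.64 and 0.67 are in use for the resolution), and the function names are chosen for illustration.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
M0 = 9.1093837015e-31     # electron rest mass, kg
Q = 1.602176634e-19       # elementary charge, C
C = 2.99792458e8          # speed of light, m/s


def electron_wavelength(voltage_v: float) -> float:
    """Relativistic electron wavelength in metres."""
    energy = Q * voltage_v
    return H / math.sqrt(2 * M0 * energy * (1 + energy / (2 * M0 * C ** 2)))


def scherzer_defocus(cs_m: float, wavelength_m: float) -> float:
    """Optimum (extended Scherzer) defocus, 1.2 * sqrt(Cs * lambda)."""
    return 1.2 * math.sqrt(cs_m * wavelength_m)


def scherzer_resolution(cs_m: float, wavelength_m: float) -> float:
    """Point resolution, 0.66 * (Cs * lambda^3)^(1/4)."""
    return 0.66 * (cs_m * wavelength_m ** 3) ** 0.25


WAVELENGTH = electron_wavelength(200e3)   # 200 kV accelerating voltage
DEFOCUS = scherzer_defocus(0.4e-3, WAVELENGTH)       # Cs = 0.4 mm
RESOLUTION = scherzer_resolution(0.4e-3, WAVELENGTH)

print(f"wavelength: {WAVELENGTH * 1e12:.3f} pm")
print(f"optimum defocus: {DEFOCUS * 1e9:.1f} nm")
print(f"point resolution: {RESOLUTION * 1e9:.3f} nm")
```

With these conventions the estimate gives a defocus of about 38 nm and a resolution slightly under 0.19 nm, close to the values stated in the text.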


Figure 1 shows the sample-thickness
dependence of lattice images of a (110) cross
section of 9-monolayer GaAs/9-monolayer AlAs
superlattices calculated at an optimum defocus
of 38 nm. Comparing the input crystal model
and output images shows that the dark and
bright bands correspond to GaAs and AlAs, respectively.
Each spot corresponds to a Ga and As pair, or
an Al and As pair, since the resolution of
0.18 nm is coarser than the projected distance
of 0.14 nm onto the (110) observation plane
between the Ga and As or between the Al and
As atoms. There are arrays of bright spots on
the uppermost AlAs layer at a sample thickness
of 6.4 nm but not at 11.2 nm and 16.0 nm.
The bright spots were visible in the range of
sample thicknesses from 6 nm to 10 nm. The
bright spot arrays were found to be a good
indicator for the location and configuration of
the heterointerfaces. The bright spot arrays
run on only one side of the two interfaces
because the arrangement of Al atoms is different
at the two interfaces. When observed from the
perpendicular direction, the bright spot arrays
can be seen only on the bottom of the AlAs
layers.
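
The distances quoted above follow from the zincblende geometry. The sketch below assumes the room-temperature GaAs lattice constant of 0.56533 nm for both GaAs and AlAs (the two are nearly lattice matched); the constant and the variable names are assumptions introduced for this check.

```python
A_GAAS_NM = 0.56533  # assumed lattice constant, used for GaAs and AlAs

# Separation of the two atoms of a Ga-As (or Al-As) pair when the
# crystal is projected along a <110> direction.
PAIR_SEPARATION_NM = A_GAAS_NM / 4

# Height of one monolayer (one Ga plane plus one As plane) along [001].
MONOLAYER_HEIGHT_NM = A_GAAS_NM / 2

# Period of a 9-monolayer GaAs / 9-monolayer AlAs superlattice.
PERIOD_NM = (9 + 9) * MONOLAYER_HEIGHT_NM

POINT_RESOLUTION_NM = 0.18
PAIR_RESOLVED = PAIR_SEPARATION_NM > POINT_RESOLUTION_NM

print(f"pair separation: {PAIR_SEPARATION_NM:.3f} nm")
print(f"monolayer height: {MONOLAYER_HEIGHT_NM:.3f} nm")
print(f"superlattice period: {PERIOD_NM:.2f} nm")
print(f"pair resolved at 0.18 nm: {PAIR_RESOLVED}")
```

Because the pair separation is smaller than the point resolution, each pair appears as a single spot, as stated in the text.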

Figure 2 shows a lattice image of the heterostructure
with a mixture of monolayer steps
(height: 0.28 nm) whose fronts are perpendicular
and parallel to the direction of the electron
beam. Doubling of the bright spot arrays can be
seen at the left on the uppermost AlAs layer
and the layer beneath it. The step-shaped arrays
of bright spots can be seen at three edges. This
indicates that even a single monolayer fluctuation
at the heterointerface is clearly
reflected in the configuration of the bright
spot arrays.

Fig. 2—A simulated TEM image and a GaAs/AlAs model
with two types of atomic layer steps: fg = 6.4 nm,
typ = 4.8 nm.

Thus with the help of these kinds of simula- 
tion, unknown atomic structures at the interface 
can be estimated. 


2.2 TEM observation of GaAs/AlAs superlattices
grown by MBE”

GaAs/AlAs superlattices with a period of
nine GaAs monolayers and nine AlAs monolayers
were grown on (001) GaAs substrates
at 500 °C and 700 °C by MBE. The transmission
electron microscope used had the same specifications
as in the previous section.

Fig. 3—Lattice image of a GaAs/AlAs superlattice grown
by MBE at 700 °C. The bright spot arrays shown
by arrows are visible at regions A and C but not
at region B.

Figure 3 shows a cross-sectional TEM image
for the sample grown at 700 °C. All features
of the image coincide with the computer
simulation. The dark bands correspond to GaAs
layers and the bright bands correspond to AlAs
layers. Ga and As, or Al and As, are visible as
pairs due to insufficient resolution. Very bright
spot arrays indicated by arrows in the figure
run on only one side of the two AlAs/GaAs
and GaAs/AlAs interfaces. The thickness of
the sample increases from bottom to top.
The bottom is the thinnest region where the
very bright spots can be seen. After appearing
in the thinnest region, they disappear and
appear again with increasing thickness.

Doublings and steps are seen in the bright
spot arrays, which suggests the existence of
atomic layer steps at the heterointerface. The
superlattice grown at 700 °C was estimated to
include steps at the interface at intervals of
less than 3 nm and the superlattice grown at
500 °C was estimated to include steps at
intervals exceeding 10 nm. The difference between
the two samples also appeared as a difference
in the transition layer thickness at the
heterointerface in Raman spectroscopy and
small-angle X-ray reflectivity analysis”.

Fig. 4—Images taken by a) SEM, and b) REM for a GaAs
surface grown by MBE at 600 °C.

The lattice images of the superlattices
grown at 700 °C and 500 °C were compared at the
same sample thickness, making use of the fact that the
bright spot arrays are visible only at limited sample
thicknesses. This should be kept in mind when
comparing two images, because thick samples
include more defects than thin samples.


3. Surface roughness of MBE-grown GaAs
analyzed with REM

In the REM observations, the sample surface 
is set close to and almost parallel to the incident 
beam in a transmission electron microscope. 
The beam reflected by the surface forms the 
image. As expected from its geometry, REM 
is very sensitive to surface structures such 
as surface defects, including atomic steps and 
dislocations. The only drawback is the
foreshortening effect caused by the grazing
electron beam, which severely limits the resolu- 
tion. Images of a GaAs surface grown by MBE 
taken by a Scanning Electron Microscope 
(SEM) and by REM are compared in Fig. 4. 
No surface features are visible in the SEM, 
but a fine structure is revealed by REM. REM 
was applied to analyze the surface roughness 
of MBE-grown GaAs. This is the first detailed 
analysis of it by REM”. 

GaAs was grown by MBE on a (001) GaAs
substrate at 700 °C. After a 300-nm-thick layer
was deposited (growth rate: 1 µm/h), the substrate
was maintained at 700 °C for three
minutes under an arsenic flux to planarize the
surface. The sample was removed from the
growth chamber and examined with a transmission
electron microscope with no treatment
between the growth and the observation.

Fig. 5—REM images of a GaAs surface grown at 700 °C by MBE and intensity change between two CAs.
They are in focus at F. CA and CB are two typical patterns of contrast. Electron beam direction
a) [110] and b) [110].

Figure 5 shows the surfaces observed from
the two perpendicular directions, [110] and
[110]. The two images give considerably
different impressions, which derive from differences
in the shape and intensity of the fine contrast features.
There are two types of contrast, CA and CB,
whose principles of image formation differ.

Contrast CA exists on the sides of a medium
contrast and appears in two ways: one is bright
and the other is dark and broad. The variation
of the contrast between these two CAs is shown
in Fig. 5. Surface undulation is thought to
cause changes in the intensity of the reflected
beam, which is one origin of contrast CA. As
shown in Fig. 6, when the electron beam is
incident at a greater angle than the maximum
slope of the undulation, the electron density
of the reflected beam increases for an upward
staircase and decreases for a downward staircase,
which causes the contrast to be bright-medium-dark,
as obtained in the experiment. By lowering
the incident angle, the contrast was observed
to change to dark-medium-dark as predicted
by the undulation model.

Fig. 6—A schematic drawing of surface undulation and
electron beams. When the angle of the incident
beam is greater than that of the slope, the
contrast is bright-medium-dark (B-M-D).

The following conclusions were drawn 
from the surface roughness obtained from 
contrast CA: 

1) the surface undulation is a collection of 
plateaus, 

2) the flat part of the plateaus is 0.1-2.0 um, 

3) these plateaus are less than 3 nm high, and 

4) the slope at the plateau borders is between
12 mrad and 20 mrad.
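
The plateau height and border slope listed above together bound the lateral width of a plateau border. The sketch below assumes a straight slope over the full height, which is a simplification introduced here; the 3 nm value is the upper limit given in the text, so the results are upper bounds.

```python
import math


def border_width_nm(height_nm: float, slope_rad: float) -> float:
    """Lateral extent of a straight slope with the given height and angle."""
    return height_nm / math.tan(slope_rad)


MAX_HEIGHT_NM = 3.0  # plateaus are less than 3 nm high

WIDTH_AT_12_MRAD = border_width_nm(MAX_HEIGHT_NM, 12e-3)
WIDTH_AT_20_MRAD = border_width_nm(MAX_HEIGHT_NM, 20e-3)

print(f"border width at 12 mrad: up to {WIDTH_AT_12_MRAD:.0f} nm")
print(f"border width at 20 mrad: up to {WIDTH_AT_20_MRAD:.0f} nm")
```

These widths are of the same order as the shortest flat parts (0.1 µm), which is consistent with describing the surface as a gentle undulation rather than a set of sharp steps.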

The CB contrasts are seen in the medium
contrast between two CAs. They are fine and
bright or dark, and correspond to the monoatomic-level
steps shown on the top and bottom
of the undulation in Fig. 6. They are straight
and parallel to [110] in Fig. 5-a) and zigzag but
nearly in the [110] direction in Fig. 5-b). They
are 20-150 nm long. The phenomenon is thought
to be caused by anisotropic Ga surface
migration in the two directions.

Fig. 7—Diffraction pattern for an InGaP mixed crystal.
Extra spots with streaks are visible.


4. InGaP natural superlattice*? 

In TEM images of III-V mixed crystals,
periodic structures other than the zincblende
structure are often seen. A major topic is the
atomic-ordering structure, often called a “natural
superlattice” after the way it is formed.

Figure 7 shows a diffraction pattern for an
InGaP mixed crystal grown on a (001) GaAs
substrate by atmospheric-pressure Metal Organic
Chemical Vapor Deposition (MOCVD). Extra
spots with streaks are visible. The characteristic
feature of these extra spots is that they appear
as pairs at positions indexed as (h+1/2, k-1/2,
l+1/2) and (h-1/2, k+1/2, l-1/2) for an
hkl matrix spot but do not appear in the other
diagonal direction. An example around the 000
spot is indicated by arrows. The streaks extend
toward the [001] and [001] directions and tilt
a few degrees toward the [110] and [110].

Fig. 8—A (110) cross section of the InGaP mixed crystal.
xjs and yjs show atomic ordering.

Fig. 9—A proposed crystal model for the InGaP ordered
crystal. In and Ga lie on separate planes.

Figure 8 shows the lattice image corresponding to
the diffraction pattern shown in Fig. 7. Among
the [111] lattice fringes, some bright spot
arrays, for example the xjs and yjs, are double
spaced. Tracing the xjs and yjs shows them to be
one spacing off. The bright spot arrays zigzag.
This double periodicity creates the extra spots in the
diffraction pattern, and the streaks around the
spots mean that the domains with double periodicity
have limited volumes. Such domains were found
to be plate-like and to lie nearly on the (001)
plane. Figure 9 is a proposed model to explain
the atomic ordering. The column III atoms, In and
Ga, lying separately on alternate (111) planes
cause the double periodicity.
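
The link between the doubled fringe spacing and the half-integer extra spots can be checked with the cubic plane-spacing formula. The sketch below assumes that the InGaP layer is lattice matched to the GaAs substrate, so that a = 0.56533 nm; this value and the function name are assumptions made for illustration.

```python
import math

A_NM = 0.56533  # assumed lattice constant (lattice matched to GaAs)


def d_spacing_nm(a_nm: float, h: float, k: float, l: float) -> float:
    """Plane spacing of a cubic crystal for (possibly fractional) indices."""
    return a_nm / math.sqrt(h * h + k * k + l * l)


D_111 = d_spacing_nm(A_NM, 1, 1, 1)
D_HALF = d_spacing_nm(A_NM, 0.5, 0.5, 0.5)

print(f"{{111}} fringe spacing: {D_111:.4f} nm")
print(f"spacing for a (1/2 1/2 1/2) type spot: {D_HALF:.4f} nm")
print(f"ratio: {D_HALF / D_111:.1f}")
```

A spot at half the reciprocal-lattice vector of a {111} reflection corresponds to twice the {111} spacing in real space, which is the double periodicity seen in the lattice image.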

Table 1 shows combinations of materials and
growth methods for which such ordering has
been reported. The ordering seems to depend on
how far each growth method lies from the
equilibrium state.

Table 1. Mixed crystals and growth methods for
which atomic ordering has been reported

LPE VPE MOCVD MBE
AlGaAs | Ref. 8 Ref. 8
GaAsSb Ref. 9 Ref. 11
InGaAs | Ref. 6 | Ref. 7
InGaP | Ref. 5 7
InAlAs Ref. 10 |

LPE: liquid-phase epitaxy VPE: vapor-phase epitaxy
MOCVD: metal-organic chemical vapor deposition
MBE: molecular beam epitaxy

The relationship between the size and 
density of the ordered domains and growth 
conditions is now being studied by the authors, 
along with the effect of ordering on material 
characteristics. 


5. Conclusion 

Recent studies on the interface structure of
GaAs/AlAs superlattices, the surface roughness
of MBE-grown GaAs, and atomic ordering in
mixed crystals have been reviewed. Concerning
the interface structure, the heterointerface was
analyzed with the help of computer simulation.
Computer simulation is, and will continue to be,
an indispensable tool now that heterostructures are
used frequently in advanced semiconductor
devices. Semiconductor surfaces have conventionally been
analyzed in an ultra-high vacuum chamber after in
situ cleaning. In this work, it was possible to observe
surface roughness with a conventional electron
microscope without any treatment of the samples, that
is, with thin native oxide layers. The ability to
observe such practical surfaces has long been in
demand. REM is one of the most effective
methods for this observation. REM will reach its
full potential when better crystals with fewer
undulations are obtained. Natural superlattices
may be found in many other materials. Analyses
of how natural superlattices are formed and of
their effect on material characteristics, including
device reliability, are now important subjects.

Expectations for TEM analysis are increasing
daily. In response to this, further efforts to
improve TEM technology, i.e. sample preparation,
observation, image processing, analysis, and
computer simulation, should be made persistently.


This paper describes an ultra high-speed bipolar process technology using an Emitter-base
Self-aligned structure with Polysilicon Electrodes and Resistors (ESPERs). This structure,
combined with trench isolation, drastically reduces parasitic capacitances and resistances,
realizing a sub-40 ps ECL circuit and high-performance bipolar devices.


1. Introduction 

Bipolar LSIs are used in mainframe computers
and digital communication systems.
They play an important role in these systems
because the system performance strongly
depends on the switching speed of the bipolar
devices. In order to reduce the switching time
and power dissipation, several sophisticated
bipolar device and process technologies have
been reported)”. They mainly use the so-called
self-aligned techniques to obtain small
active areas without using tight lithography
rules. Figure 1 shows the recent improvements
in the performance of bipolar devices. With
the introduction of various self-aligned technologies,
bipolar devices entered a new
generation. In this paper, a newly developed
self-aligned structure optimized for high-performance
bipolar VLSIs and high-frequency
ICs is described. This structure, named an
Emitter-base Self-aligned structure with Polysilicon
Electrodes and Resistors (ESPERs), is
combined with U-groove isolation with thick
Field OXide (U-FOX) or trench isolation”.
As a result, smaller base-collector capacitances
and reduced collector-substrate capacitances
have been achieved. Moreover, optimization
of the device dimensions has also been considered
for ECL circuits.

Fig. 1—Trends of basic gate delay time of ECL circuits.


2. Device optimization 

The contribution of each device parameter
to the basic gate delay time (tpd) of ECL
circuits was investigated by circuit simulation
for the case of self-aligned bipolar device
structures. Figure 2 shows the results of the
simulation. Among the parasitic capacitances and
resistances, the most crucial parameter that
affects the switching time is the base-collector
capacitance (Ccb). When the switching current
is 1 mA (the effective emitter size is 0.35 x
10 µm²), a ten-percent decrease in Ccb corresponds
to a four-percent reduction in tpd.
A ten-percent reduction in the base resistance
(rb) or the same reduction in the collector-substrate
capacitance (Ccs) causes a two-percent
decrease in tpd. On the other hand, a ten-percent
reduction in the base-emitter
capacitance corresponds to a decrease of less than
one percent. In addition to the parasitics,
the cutoff frequency (fT) is also an important
factor affecting the switching time. A higher
fT can be achieved by a shallower base. But
as a shallow base often causes a higher base
resistance, the device structure should be
optimized to keep the base resistance low.

Fig. 2—Simulation results.

The polysilicon emitter-base self-aligned
structure is suitable for reducing the base-collector
junction area. This also makes it
possible to achieve a shallow base without
increasing the base resistance. All of these
properties originate from the stacked double
polysilicon structure for the base and emitter
electrodes. The U-FOX, or the trench isolated
structure with a thick field oxide layer, reduces
Ccs and the parasitic wiring-substrate and
resistor-substrate capacitances. Accordingly, the
combination of ESPER with U-FOX is the best
solution for high-performance bipolar devices.
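
The simulated sensitivities quoted in this chapter can be combined into a first-order estimate of the change in gate delay for small parameter changes. The sketch below assumes that the individual contributions add linearly, which the paper does not state; the dictionary keys, the function name and the use of 0.1 as an upper bound for the base-emitter capacitance are assumptions made for illustration.

```python
# Fractional change in tpd per fractional change in each parameter,
# read from the text (e.g. 10 % less Ccb gives 4 % less tpd).
SENSITIVITY = {
    "Ccb": 0.4,   # base-collector capacitance
    "rb": 0.2,    # base resistance
    "Ccs": 0.2,   # collector-substrate capacitance
    "Cbe": 0.1,   # base-emitter capacitance (upper bound, "less than 1 %")
}


def estimate_delay_change(changes: dict[str, float]) -> float:
    """First-order fractional change in tpd for small parameter changes."""
    return sum(SENSITIVITY[name] * delta for name, delta in changes.items())


# Example: reduce Ccb and rb by 10 % each.
CHANGE_CCB_ONLY = estimate_delay_change({"Ccb": -0.10})
CHANGE_CCB_AND_RB = estimate_delay_change({"Ccb": -0.10, "rb": -0.10})

print(f"Ccb -10 %: tpd {CHANGE_CCB_ONLY * 100:+.1f} %")
print(f"Ccb and rb -10 %: tpd {CHANGE_CCB_AND_RB * 100:+.1f} %")
```

The estimate is only meaningful for small changes around the simulated operating point (1 mA switching current, 0.35 x 10 µm² emitter).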


3. Steps in the fabrication process of ESPER 
In this chapter, the steps in the fabrication
process of ESPER are described.

1) Transistors are isolated by deep trenches
and active regions are surrounded by a thick
field oxide layer. This step uses the conventional
U-FOX technique”.

2) An undoped polysilicon layer is deposited
on the n-type epitaxial layer. The polysilicon
base electrodes and resistors are formed
in the polysilicon, followed by boron ion
implantation and CVD oxide deposition.

3) The intrinsic base regions are opened by
photolithography and etching of the CVD
oxide and the polysilicon.

4) The surfaces of the n-type epitaxial layer
and the polysilicon are oxidized while the
extrinsic base regions are formed by diffusion
from the boron-doped polysilicon.

5) The intrinsic base regions are formed by
boron ion implantation. The emitter window
is opened and arsenic is diffused from
another polysilicon layer.

Figure 3 shows the schematic cross-section
of an ESPER transistor combined with U-FOX.
Figure 4 shows a cross-sectional SEM micrograph
of the intrinsic and extrinsic base regions.

Fig. 3—Schematic cross-section of an ESPER transistor
combined with U-FOX.

Fig. 4—Cross-sectional SEM micrograph of an intrinsic
and extrinsic base region.

Fig. 5—Gummel plot of a 0.5 x 16 µm² x 2 emitter
transistor (plot annotation: Va = 0 V, T = 300 K).

Fig. 6—Current gain vs. collector current for the same
transistor as Fig. 5.

Fig. 7—Leakage current vs. voltage characteristics of
each junction (horizontal axis: bias, 0 to 10 V).


4. Transistor characteristics 

By using the ESPER technique, many types
of transistors were fabricated to check their DC
and AC characteristics. Figure 5 shows a typical
Gummel plot of a double-stripe 0.5 x 16-µm²
emitter transistor. This graph shows nearly ideal
I-V characteristics for both the collector and
base currents from the low injection level.
The dependence of the current gain on the collector
current is shown in Fig. 6. The current gain
is nearly constant for collector currents from
10 nA to 5 mA. Figure 7 shows the I-V characteristics
of the leakage current versus the emitter-base
and collector-base reverse bias voltages
and the collector-emitter voltage with the base
electrode open. The leakage current of the
E-B junction is 43 pA at a reverse bias of
2 V. The leakage current between the collector
and emitter is 300 pA at a bias of 5 V. These
values are satisfactory even for very low-current
digital circuits.

In addition to the DC characteristics, the cutoff
frequency (fT) was also measured using
S-parameters. In the case of a 0.35 x 10-µm²
emitter transistor, a peak fT of 17.2 GHz
was obtained at a VCE of 3 V, as shown in
Fig. 8.

Fig. 8—Cutoff frequency vs. collector current
(0.35 x 10 µm² emitter size).

Other types of transistors such as 4200
parallel arrayed transistors were also fabricated.
They show little junction leakage and have
satisfactory DC characteristics. The device
parameters of the ESPER transistors compared
with those of a conventional U-FOX transistor
are listed in Table 1”.
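
"Nearly ideal" I-V characteristics in a Gummel plot mean that both currents follow exp(qV/nkT) with an ideality factor n close to 1, i.e. a slope of about 60 mV per decade of current at 300 K. The sketch below computes that reference slope and the ideality factor implied by a measured slope; the 62 mV/decade example value and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q = 1.602176634e-19     # elementary charge, C


def ideal_slope_mv_per_decade(temperature_k: float) -> float:
    """Base-emitter voltage change per decade of current for n = 1."""
    return K_B * temperature_k / Q * math.log(10) * 1e3


def ideality_factor(measured_mv_per_decade: float,
                    temperature_k: float) -> float:
    """Ideality factor n implied by a measured Gummel-plot slope."""
    return measured_mv_per_decade / ideal_slope_mv_per_decade(temperature_k)


SLOPE_300K = ideal_slope_mv_per_decade(300.0)
N_EXAMPLE = ideality_factor(62.0, 300.0)  # hypothetical measured slope

print(f"ideal slope at 300 K: {SLOPE_300K:.1f} mV/decade")
print(f"ideality factor for 62 mV/decade: {N_EXAMPLE:.3f}")
```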


5. Circuit performance 

Based on the results of circuit simulation 
in Chap. 2, the dependence of the tpq of ECL 
circuits on the C,, was investigated. By fabricat- 
ing two types of transistors with different 
extrinsic base areas for the same ECL circuit, 
the basic gate delay times were compared. 
One of the transistors had an effective extrinsic 
lateral base width of 0.54 um. That for the other 
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Table 1. ESPER-transistor parameters compared with 
conventional U-FOX transistor. 


               | ESPER transistor                | U-FOX transistor
Emitter size   | 0.35 × 10 µm² | 0.35 × 2.9 µm² | 0.8 × 2 µm²
CEB            | 40 fF         | 8.2 fF         | 7 fF
CCB            | 9.5 fF        | 3.7 fF         | 11 fF
CCS            | 50 fF         | 12 fF          | 4 fF
hFE            | 100           | 100            | 120
fT             | 17.2 GHz      | 13 GHz         | 6 GHz


[Fig. 9 plot: tpd (ps/gate) versus switching current (0 to 1.0 mA); measured and simulated curves for the 2.9-µm-long and 10-µm-long emitters.]

Fig. 9—Basic gate delay time vs. switching current. 


was 0.34 µm. The total lateral base widths
were 1.88 µm and 1.48 µm, respectively. A
comparison of the circuits made with these
transistors produced the following results.
For high-power circuits, whose transistors
had an effective emitter size of 0.35 × 10 µm²,
the CCB values were 11.3 fF and 9.5 fF for
the two base areas, and the tpd values were 50 ps and
46 ps, respectively. The switching current
was 1.0 mA. For low-power circuits, whose
transistors had an effective emitter size of 0.35 ×
2.9 µm², the CCB values were 4.42 fF and 3.74 fF,
while the tpd values were 82 ps and 75 ps. The
switching current was 0.25 mA. In these cases,
a thirty-percent decrease in the CCB corresponds
approximately to a ten-percent decrease in the
tpd. This result corresponds fairly closely
to the prediction of the simulation.
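
The relative changes behind this comparison can be checked directly from the four quoted data pairs. The sketch below is only arithmetic on the numbers in the text; the function and dictionary names are illustrative and do not come from the paper. On these data the CCB decrease is roughly 15 to 16 percent and the tpd decrease roughly 8 to 9 percent.

```python
def relative_decrease(before: float, after: float) -> float:
    """Fractional decrease going from `before` to `after`."""
    return (before - after) / before


# (CCB in fF, tpd in ps) for the wider-base and narrower-base transistors,
# copied from the figures quoted in the text.
MEASUREMENTS = {
    "high-power, 1.0 mA": {"wide": (11.3, 50.0), "narrow": (9.5, 46.0)},
    "low-power, 0.25 mA": {"wide": (4.42, 82.0), "narrow": (3.74, 75.0)},
}


def sensitivity(case: dict) -> tuple[float, float, float]:
    """Return (CCB decrease, tpd decrease, tpd decrease per unit CCB decrease)."""
    (ccb_w, tpd_w), (ccb_n, tpd_n) = case["wide"], case["narrow"]
    d_ccb = relative_decrease(ccb_w, ccb_n)
    d_tpd = relative_decrease(tpd_w, tpd_n)
    return d_ccb, d_tpd, d_tpd / d_ccb


for name, case in MEASUREMENTS.items():
    d_ccb, d_tpd, ratio = sensitivity(case)
    print(f"{name}: CCB down {d_ccb:.1%}, tpd down {d_tpd:.1%}, ratio {ratio:.2f}")
```

The last column is the fractional change in tpd per fractional change in CCB, which is the quantity the text compares with the simulation.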


By using the 0.35 × 10-µm² emitter
transistors with a base width of 1.48 µm, a
minimum basic gate delay time of 38.8 ps
was obtained at a switching current of 1.28 mA.
In the case of the 0.35 × 2.9-µm² emitter transistors
with the same base width, 60.2 ps was obtained
at a switching current of 0.34 mA. The tpd
versus switching current characteristics are
shown in Fig. 9. For these transistors, the
wiring contributed 50 ps per mm of length, and the
gate loading added 7 ps per fanout for a
0.32-mA emitter follower current. These values are
suitable for VLSI ECL circuits. By
employing the ESPER process, a 100-ps 10 000-gate
ECL array has been put on the market.
The ESPER process is suitable not only for
VLSI applications but also for high-frequency
digital communication ICs in the gigahertz
range. For instance, a 3.6-GHz preamplifier
and a 6.3-Gbit/s D-F/F for optical repeaters have
been reported.
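
As a rough illustration of how the wiring and fanout figures combine with the basic gate delay, the sketch below assumes the three contributions simply add. That additive model, the function name and the example loading are assumptions made here for illustration; the text only gives the per-millimetre and per-fanout increments.

```python
WIRE_DELAY_PS_PER_MM = 50.0   # from the text
FANOUT_DELAY_PS = 7.0         # from the text, at 0.32-mA emitter follower current


def loaded_gate_delay_ps(basic_delay_ps: float, wire_mm: float, fanout: int) -> float:
    """Estimate the loaded delay, assuming wiring and fanout terms simply add."""
    if wire_mm < 0 or fanout < 0:
        raise ValueError("wire length and fanout must be non-negative")
    return basic_delay_ps + WIRE_DELAY_PS_PER_MM * wire_mm + FANOUT_DELAY_PS * fanout


# Hypothetical loading: 1 mm of wiring and a fanout of 3 on the 38.8-ps gate.
print(loaded_gate_delay_ps(38.8, wire_mm=1.0, fanout=3))
```

Under this assumed loading the estimate is on the order of 100 ps, the same order as the gate delay quoted for the 10 000-gate array.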


6. Conclusion 

This paper describes the newly developed
ultra high-speed bipolar process technology
named ESPER. By combining ESPER with
trench isolation (e.g. U-FOX), parasitic capacitances
and resistances have been reduced drastically.
Among the parasitics, the base-collector
capacitance is the most crucial factor affecting
the switching speed. The base area was optimized
for suitable circuit operation and power
dissipation. By using these processes, a sub-40-ps
ECL circuit was demonstrated, as well as
high-performance VLSI chips and high-frequency ICs
for gigahertz-range digital systems.
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This paper describes a high-speed BiCMOS technology which consists of bipolar process 
technology using a polysilicon emitter and CMOS process technology using the 1.0-µm rule.
High-speed characteristics were obtained: the cutoff frequency (fT)
of the bipolar npn transistor is 6 GHz, and the propagation delay time (tpd) of
the CMOS gate is 0.5 ns. The high performance of the conventional bipolar and
CMOS devices is also maintained.


The BiCMOS technology has been applied to fabricate a 2 000-gate gate array and a 256-Kbit 
SRAM. The results of these devices are also reported. 


1. Introduction 

BiCMOS technology is widely regarded as 
providing a possible means for realizing high- 
performance LSIs with multiple functions.
This is because BiCMOS technology has the high 
drivability of bipolar LSIs combined with the 
low power dissipation and high packing density 
of CMOS LSIs. BiCMOS technology is therefore 
drawing a great deal of attention from various 
fields of application. 

At Fujitsu, a digital/analog BiCMOS has
already been developed. More recently, a high-speed
BiCMOS technology has been developed,
and satisfactory results have been obtained.
This technology consists of high-speed bipolar
technology using a polysilicon emitter, which is
based on DOPOS (Doped Polysilicon) technology,
and high-density CMOS technology
using the 1.0-µm rule.

This paper describes the device structure
and the process technology, and discusses a
high-speed 2 000-gate gate array and a high-speed
256-Kbit SRAM as application examples.
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2. Device structure and fabrication process steps 

The structure of the BiCMOS device is 
shown in Fig. 1. 

Photolithography with the 1.0-µm rule is used.
In making the bipolar transistor, a polysilicon
emitter is formed by using improved DOPOS
technology, realizing a minimum emitter size
of 1.1 × 1.9 µm². In the MOS transistors, the
gate oxide is 25 nm thick. The gate lengths are
1.2 µm for nMOS and 1.4 µm for pMOS in the
SRAM, and 1.5 µm and 1.8 µm, respectively,
in the gate array.

In order to reduce the hot electron effect,
an LDD (Lightly Doped Drain) structure is
applied to the nMOS in the SRAM, and a DDD (Double
Diffused Drain) structure is applied to the nMOS
in the gate array. The gate length is reduced
in the SRAM to shrink the memory
cell to 66.6 µm².

In the source/drain contacts of the nMOS,
the contact resistance is lowered using the same
doped polysilicon technique as that used in the
emitter contact. The process parameters are shown
in Table 1.

The fabrication process steps are shown in 
Fig. 2. 
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Fig. 1—BiCMOS device structure. 


Table 1. Process parameters

Item                                 | Specification
Epitaxial thickness                  | 1.6 µm
Gate oxide thickness                 | 25 nm
Field oxide thickness                | 500 nm
DOPOS thickness                      | 100 nm
Minimum emitter size                 | 1.1 × 1.9 µm²
Gate length/drain structure (nMOS)   | 1.2 µm/LDD (SRAM)
                                     | 1.5 µm/DDD (gate array)


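[Fig. 2 flow chart: process steps arranged in three columns, CMOS (base), common, and bipolar (added), starting from a p substrate. Legible step labels: p-well, channel stop, LOCOS isolation, gate oxidation, gate poly-Si, (LDD n−), source/drain n+, source/drain p+, isolation, 2nd poly-Si, through poly-Si, emitter diffusion, Al metallization.]
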

Fig. 2—Fabrication process steps. 
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Fig. 3—Bipolar npn transistor gummel plot. 


A p-well is simultaneously formed with the 
isolation layer. Likewise, a graft base is formed 
with the pMOS source/drain layer, the poly- 
silicon layer of the emitter contact is formed 
with that of the nMOS source/drain contacts, 
and the emitter and the nMOS source/drain 
contacts are diffused by simultaneously reflow- 
ing the PSG layer formed over the polysilicon. 
Consequently, compared with the conventional 
CMOS process, the additional process steps 
required for BiCMOS are only those including 
buried layer fabrication, epitaxial growth and 
ion implantation for the collector and base. 


3. Device characteristics 

A typical Gummel plot of a 1.1 × 1.9 µm²
emitter transistor is shown in Fig. 3. This graph
shows ideal I-V characteristics for both the
collector and base currents from the injection
level. The current gain (hFE) is 100, the breakdown
voltage between the emitter and collector
(BVCEO) is 8 V and that between the base and
emitter (BVEBO) is 5.6 V.

The dependence of the cutoff frequency (fT)
on the emitter current for a transistor with an emitter
measuring (4 × 62) × 2 µm² is shown in Fig. 4.
The maximum cutoff frequency (fTmax) of
6 GHz is obtained at an emitter current of
20 mA. The dependence of the drain current on
the gate voltage of the nMOS (gate length of


Fig. 4—Bipolar npn transistor cutoff frequency vs.
emitter current.
[Plot: fT (GHz) versus IE (1 to 100 mA); SE = (4 × 62) × 2 µm².]

Fig. 5—nMOS transistor VG-ID.
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Table 2. BiCMOS device dimensions and characteristics 


Transistor | Group           | Item                 | Gate array    | SRAM
BIP        | Dimensions      | Min. emitter size    | 1.4 × 4 µm²   | 1.1 × 1.9 µm²
           |                 | Base depth           | 0.3 µm
           |                 | Emitter depth        | 0.15 µm
           | Characteristics | hFE                  | 100
           |                 | BVCEO                | 8 V
           |                 | BVEBO                | 5.6 V
           |                 | Cutoff frequency     | 6 GHz
MOS        | Dimensions      | Gate oxide thickness | 25 nm
           |                 | nMOS gate length     | 1.5 µm (DDD)  | 1.2 µm (LDD)
           |                 | nMOS junction depth  | 0.4 µm        | 0.25 µm
           |                 | pMOS gate length     | 1.8 µm        | 1.4 µm
           |                 | pMOS junction depth  | (illegible)
           | Characteristics | nMOS VTH             | 0.6 V         | 0.65 V
           |                 | nMOS β               | 1 100 µS/V    | 1 200 µS/V
           |                 | nMOS BVSD            | 14 V
           |                 | pMOS VTH             | 0.6 V         | 0.75 V
           |                 | pMOS β               | 400 µS/V
           |                 | pMOS BVSD            | 14 V

Fig. 6—pMOS transistor VG-ID.
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Fig. 7—Photograph of a chip with a 2 000-gate gate array. 


1.2 µm, gate width of 20 µm) is shown in Fig. 5,
and that for the pMOS (gate length of 1.4 µm,
gate width of 20 µm) is shown in Fig. 6. For the
nMOS, β and BVSD are 1 200 µS/V and 14 V,
respectively; for the pMOS they are 400 µS/V
and 14 V.

The device characteristics obtained using 
BiCMOS technology are nearly equal to those 
obtained using the conventional bipolar and 
MOS technology individually. The main device 
characteristics are shown in Table 2. 


4. Application 
4.1 2 000-gate gate array 

Figure 7 is a photograph of the developed 
2 000-gate gate array. 

The dependence of the switching speed on
the load capacitance is important for high-speed
operation of a device. In the BiCMOS gate
array, this dependence is improved by adding a
drive circuit consisting of bipolar
transistors at the output of the CMOS
logic. An additional feature of the BiCMOS is
that the off-state current of the CMOS output
circuit is nearly zero, enabling low
power to be maintained. As the scale of integration
becomes larger, the wiring length of the
gate array increases. It is therefore necessary to further
improve the dependence of the switching speed on the
load capacitance in order to build high-speed
systems.

Fig. 8—A 3-input NAND gate.

The basic circuit of the BiCMOS logic can
be constructed in two ways. One type uses a
MOS transistor and the other uses a
resistor as the impedance that discharges
the charge stored in the base of the bipolar transistor.
The authors selected the type with the
resistor impedance, as shown in Fig. 8. This
basic circuit consists of an nMOS (gate length of
1.5 µm), a pMOS (gate length of 1.8 µm) and an npn
transistor (emitter size of 1.4 × 4 µm²).

For the 2-input NAND gate, the measured
gate delay times (tpd) versus the load capacitance
are shown in Fig. 9. Under a standard load
capacitance (wiring length of 3 mm, FO of 3), the
BiCMOS gate is twice as fast as a
CMOS gate consisting of transistors of
the same size. The dependence of gate delay on
load capacitance is 0.3 ns/pF. This value is about
1/5 of that of the CMOS gate.
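
To make the slope comparison concrete, the sketch below computes the extra delay caused by extra load under a purely linear model. The CMOS slope is not stated in the text; it is inferred here by reading "about 1/5" as exactly 1/5, which is an assumption, and the function name is illustrative.

```python
BICMOS_SLOPE_NS_PER_PF = 0.3                       # quoted in the text
CMOS_SLOPE_NS_PER_PF = 5 * BICMOS_SLOPE_NS_PER_PF  # assumption: "about 1/5" taken as exactly 1/5


def added_delay_ns(slope_ns_per_pf: float, extra_load_pf: float) -> float:
    """Extra gate delay caused by extra load, assuming a linear dependence."""
    return slope_ns_per_pf * extra_load_pf


for extra_pf in (0.5, 1.0, 2.0):
    bicmos = added_delay_ns(BICMOS_SLOPE_NS_PER_PF, extra_pf)
    cmos = added_delay_ns(CMOS_SLOPE_NS_PER_PF, extra_pf)
    print(f"+{extra_pf} pF: BiCMOS +{bicmos:.2f} ns, CMOS +{cmos:.2f} ns")
```

The sketch deliberately models only the increments, not the absolute delays, since the zero-load intercepts of the two curves are not given in the prose.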

In the range below 0.1 pF of the load
capacitance, the gate delay of the BiCMOS is
larger than that of the CMOS, owing to the
influence of the parasitic capacitances between
the base and collector (CCB) and between
the base and emitter (CEB). Therefore, the
BiCMOS circuit is most suitable for a highly
integrated, high-speed circuit with a large load
capacitance.

Fig. 9—Propagation delay time vs. load capacitance.
[Plot: tpd of the 2-input NAND gate versus CL (0 to 2.0 pF) for CMOS and BiCMOS, with the standard load capacitance marked.]

Table 3. Characteristics of 2 000-gate gate array

Block         | Item                            | Specification
Internal gate | Delay time (ns), CL = 0 pF      | 0.5
              | Delay time (ns), standard load  | 0.8
              | Power dissipation (mW)          | 0.25 (note 1)
              | Toggle frequency (MHz)          | 180
Input buffer  | Delay time (ns), tPLH           | 1.7
              | Delay time (ns), tPHL           | (illegible)
              | Power dissipation (mW)          | (illegible)
Output buffer | IOL (mA)                        | 24          | 10
              | Delay time (ns), tPLH           | 4.0         | 4.0
              | Delay time (ns), tPHL           | 5.3         | 6.3
              | Power dissipation (mW)          | (illegible) | (illegible)

Note 1: Dynamic state (at 10 MHz). Note 2: Static state.

The characteristics of this gate array are 
summarized in Table 3. The BiCMOS achieves 
high performance, and is faster than the CMOS 
while having the same power dissipation as the 
CMOS. 


4.2 256-Kbit SRAM

The power dissipation of a bipolar 64-Kbit
(ECL) SRAM is normally about 1.0 W⁸⁾ and
that of a bipolar 256-Kbit (ECL) SRAM is
expected to be almost 2 W. If this trend is simply
followed, special mounting and cooling
techniques will be required as integration
advances further. If a high-speed
ECL SRAM with large-scale integration is to
be used widely, its power dissipation must be
lowered, which BiCMOS technology makes possible.
The BiCMOS 256-Kbit SRAM was therefore
developed in this work.

A photograph of the fabricated 256-Kbit
SRAM is shown in Fig. 10. The chip measures 9.36 ×
4.46 mm² and can be mounted in a 300-mil
DIP package.

To optimize the performance of each part
of the circuit, a polysilicon-loaded nMOS
cell is used for the memory cell, BiCMOS
circuits are used in the gate circuits of the decoder,
etc., and an ECL circuit is used in the input/output
buffer. Using a double-polysilicon structure for the
polysilicon-loaded nMOS memory cell enables
a small memory cell size of 66.6 µm² to be
obtained. A photograph of the memory
cell is shown in Fig. 11.

Fig. 10—Photograph of a chip with a 256-Kbit SRAM.
[Photograph label: cell array.]

Fig. 11—Photograph of memory cell.

Fig. 12—Switching waveforms.
[Waveform labels: address input, data output.]

The switching waveform in Fig. 12 shows
that the access time is 10 ns with a power dissipation
of 500 mW at 80 MHz. The characteristics
of this SRAM are summarized in Table 4.

Table 4. Characteristics of 256-Kbit SRAM

Item                | Specification
Address access time | 10 ns
Write pulse width   | 5 ns
Power dissipation   | 700 mW at 80 MHz
Power supply        | (illegible)
I/O level           | ECL 10K
Organization        | 256 Kword × 1 bit
Redundancy          | 4 rows, 8 columns
Cell size           | 66.6 µm²
Chip size           | 9.36 × 4.46 mm²


5. Conclusion 

A high-speed BiCMOS process technology,
consisting of a bipolar process technology
using a polysilicon emitter and a CMOS process
technology using the 1.0-µm rule, has been developed.
High-speed characteristics were obtained: the cutoff
frequency (fT) of the bipolar npn transistor is
6 GHz and the propagation delay (tpd) of the
CMOS gate is 0.5 ns. The high performance of
conventional bipolar and CMOS devices is
also maintained.

When applied to a 2 000-gate gate array, a
speed about twice that of CMOS is obtained:
the propagation delay (tpd) is 0.8 ns in
a 2-input NAND gate with the standard load capacitance,
and the power dissipation is 0.25 mW/gate,
nearly the same as that of CMOS.
When applied to a 256-Kbit SRAM, an access
time of 10 ns and a power dissipation of 500 mW
are obtained.

In the present BiCMOS process, further
improvement in the characteristics of both the
bipolar and MOS transistors can be
achieved. Shallower base and source/drain
layers can be made to increase the cutoff
frequency and transconductance, provided a
lower-temperature process is developed.


References

1) Yamauchi, T., Wakui, Y., Inayoshi, K., Tsuchiya,
C., and Tokuriki, M.: 20V BiCMOS Technology
with Polysilicon Emitter Structure. 1987 Electrochem.
Soc., Spring Meet. Abstract, 286, pp. 419-420.

2) Tsuchiya, C., Ono, A., Tamada, H., Yamauchi, T.,
Usui, Y., and Oshio, U.: A BI/CMOS Analog Master
Chip with Versatile Macro Cell. 1987 Electrochem.
Soc., Spring Meet. Abstract, 274, pp. 339-400.

3) Ikeda, T., Nagano, T., Momma, N., Miyata, K.,
Higuchi, H., Odaka, M., and Ogiue, K.: Advanced
BiCMOS Technology for High Speed VLSI. 1986
IEEE Int. Electron Devices Meet. Tech. Dig.,
pp. 408-411.

4) Iwai, H., Sasaki, G., Niitsu, Y., Norishima, M.,
Sugimoto, Y., and Kanzaki, K.: 0.8 µm BiCMOS
Technology with High fT Ion-Implanted Emitter
Bipolar Transistor. 1987 IEEE Int. Electron Devices
Meet. Tech. Dig., pp. 28-31.

5) Fukushi, I., Okajima, Y., Maki, Y., Ishii, Y.,
Nomura, O., Toyoda, K., Yamauchi, T., and
Fukuma, H.: A 256K ECL RAM with Redundancy.
1988 IEEE Int. Solid-State Circuits Conf. Dig.
Tech. Pap., pp. 134-135.

6) Takagi, M., Nakayama, K., Terada, C., and Kamioka,
H.: Improvement of Shallow Base Technology by
Using a Doped Poly-Silicon Diffusion Source.
Suppl. Jpn. Soc. Appl. Phys., 42, pp. 101-109
(1973).

7) Watanabe, T., Ikeda, T., Nagano, T., Momma, N.,
Nishio, Y., Tamba, N., Odaka, M., and Ogiue, K.:
High Speed BiCMOS VLSI Technology with Buried
Twin Well Structure. 1985 IEEE Int. Electron
Devices Meet. Tech. Dig., pp. 423-426.

8) Okajima, Y., Toyoda, K., Awaya, T., Tanaka, K.,
Nakamura, Y.: 64Kb ECL RAM with Redundancy.
1985 IEEE Int. Solid-State Circuits Conf. Dig. Tech.
Pap., pp. 48-49.


Hiroyuki Fukuma
Process Engineering Dept.
Bipolar Division
FUJITSU LIMITED
Bachelor of Electronics Eng.
Tottori University 1980
Master of Electronics Eng.
Tottori University 1982
Specializing in Bipolar and BiCMOS Devices


Tsunenori Yamauchi
Process Engineering Dept.
Bipolar Division
FUJITSU LIMITED
Bachelor of Electrical Eng.
Nishinippon Institute of Technology 1971
Master of Electronics Eng.
Kyushu Institute of Technology 1974
Specializing in Bipolar and BiCMOS Devices


Yoshinori Okajima
Bipolar Memory IC Design Dept.
Bipolar Division
FUJITSU LIMITED
Bachelor of Physics
Kyoto University 1981
Specializing in High Density Static RAM Design


H. Fukuma et al.: High-Speed BiCMOS Technology with ...

FUJITSU Sci. Tech. J., 24, 4, (December 1988)


UDC 621.382.33:621.7.04 


Characteristics of Si HBT with 
Hydrogenated Micro-Crystalline 
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An npn Si HBT has been fabricated using hydrogenated micro-crystalline Si as a wide gap 
emitter. It shows much higher common emitter current gain than a conventional homo- 
junction transistor. The measured common emitter current gains of the fabricated HBTs 
having intrinsic base sheet resistances of 14 kΩ/□ and 95 Ω/□ are 1500 and 18, respectively.
The present HBT can perform normal operation even at liquid nitrogen temperature. 


1. Introduction 

Heterojunction bipolar transistors (HBTs) 
have attracted much attention in recent years 
because of their great potential for future 
high-speed digital and microwave circuit applications.
A great advantage of HBTs is that the
base can be heavily doped to lower the base resistance
without reducing the current gain. In the HBT
structure, the minority carrier back injection
from the base to a wide gap emitter can be
strongly suppressed because of a large energy
barrier in the valence band.

An additional advantage of the hetero-emitter
lies in its low-temperature operation.
Low-temperature operation of bipolar devices 
is very attractive because it improves trans- 
conductance and reduces interconnection delay. 
Most Si homojunction bipolar devices, however, 
suffer serious degradation in current gain at 
liquid nitrogen temperature (LNT). The current 
gain drop-off in Si homojunction bipolar devices 
is currently thought to be mainly due to shrink- 
age of the band gap in heavily doped emitters. 
Even if the degradation in current gain could be 
eliminated, carrier freezeout would reduce the 
speed. HBT, on the other hand, is considered to 
be suitable for low-temperature operation 
because it has a wide gap emitter and heavily 
doped base. 
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Several groups working in the Si device 
field have fabricated Si HBTs using wide-gap 
materials such as SIPOS, a-Si:H and a-SiC:H.
Although they have obtained very interesting
results in current gain improvement, these
materials are unsuitable for emitters of scaled
high-speed LSIs because of their high resistivity.
The authors therefore used phosphorus-doped
hydrogenated micro-crystalline Si (µc-Si:H) as
a wide gap emitter with low resistivity. The
µc-Si:H, which is a mixture of hydrogenated
amorphous and micro-crystalline phases, has
a low resistivity and a wide band gap.

This paper describes the characteristics of
a Si HBT with a µc-Si:H emitter at room temperature
and liquid nitrogen temperature (LNT).


2. Characteristics of µc-Si:H

The phosphorus-doped µc-Si:H film was
deposited at a substrate temperature as low as
240 °C to 450 °C in a gaseous mixture of SiH4,
H2 and PH3. Figure 1 shows the X-ray diffraction
pattern of the (111) peak of the µc-Si:H
film. This peak is very broad, showing that the
film contains a micro-crystalline phase whose
average grain size is about 5 nm. From IR and
UV spectra measurements, it is found that this
film contains hydrogen and has an optical
band gap of 1.5 eV to 1.9 eV.

Fig. 1—X-ray diffraction pattern of µc-Si:H film.
[Plot: intensity (a.u.) versus 2θ (24° to 32°), showing the Si (111) peak.]

Fig. 2—Cross-section of µc-Si:H/c-Si heterodiode.
[Labels: µc-Si:H layer on a p-type 10 Ω·cm substrate.]


3. Energy band diagram of the heterojunction

Before fabricating HBTs with µc-Si:H,
a µc-Si:H/crystalline Si np heterodiode was
fabricated to estimate the energy band diagram
of the heterojunction. A schematic cross-section
of the diode is shown in Fig. 2. Figure 3 shows the
1/C²-V characteristics of the µc-Si:H/crystalline
Si heterodiode. One can see that 1/C² is proportional
to V and that the slope gives a value close to
the substrate impurity concentration. These
results indicate that the one-sided step junction
model fits this diode well, and that a large
portion of the depletion layer extends into the
p-type crystalline substrate. The energy band
diagram of the µc-Si:H HBT can be estimated
as shown in Fig. 4 using the built-in potential
(1.02 eV) derived from this result, the activation
energy (0.03 eV) of the conductivity of the
µc-Si:H film and the optical energy band gap
(1.81 eV), which is assumed to be approximately
equal to the electrical band gap. As can be
seen in this figure, there is a large barrier for
holes, ΔEV (0.53 eV), which is expected to
reduce the base current.

Fig. 3—1/C²-V characteristics of µc-Si:H/c-Si
heterodiode.
[Plot: 1/C² (×10⁻⁴ mm⁴/pF²) versus voltage (V); NA = 1.4 × 10¹⁵/cm³.]

Fig. 4—Energy band structure of the µc-Si:H HBT.
[Labels: emitter, base, collector; ΔEC = 0.16 eV.]

Fig. 5—Schematic diagram of the steps in fabricating
µc-Si:H HBT.
[Steps shown: thermal oxidation; boron ion implantation (32 keV); thermal anneal for activation (900 °C, 30 min); CVD SiO2 deposition; µc-Si:H deposition; electrode fabrication (E, B, C).]
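
One plausible way the quoted quantities fit together is sketched below. It is not presented as the authors' procedure. It assumes the 0.03-eV activation energy equals EC − EF in the n-type µc-Si:H, that the p-type substrate is non-degenerate with the acceptor density read from Fig. 3, and it uses textbook crystalline-Si values (1.12-eV gap, valence-band effective density of states of 1.04 × 10¹⁹ cm⁻³ at 300 K) that do not appear in the paper. Under these assumptions the conduction-band and valence-band offsets come out close to the 0.16 eV and 0.53 eV quoted for Fig. 4.

```python
import math

K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant, eV/K

# Values taken from the text and Fig. 3.
BUILT_IN_POTENTIAL_EV = 1.02
EMITTER_ACTIVATION_EV = 0.03   # taken here as EC - EF in the n-type uc-Si:H
EMITTER_GAP_EV = 1.81          # optical gap of uc-Si:H
SUBSTRATE_DOPING_CM3 = 1.4e15  # acceptor density from the 1/C^2-V slope

# Textbook values for crystalline Si, not given in the paper (assumptions).
SI_GAP_EV = 1.12
SI_NV_CM3 = 1.04e19            # effective valence-band density of states at 300 K
TEMPERATURE_K = 300.0


def band_offsets() -> tuple[float, float]:
    """Return (delta_EC, delta_EV) in eV for the uc-Si:H / c-Si junction."""
    kt = K_B_EV_PER_K * TEMPERATURE_K
    # EF - EV in the non-degenerate p-type substrate.
    substrate_fermi_ev = kt * math.log(SI_NV_CM3 / SUBSTRATE_DOPING_CM3)
    # q*Vbi = (Eg_Si + delta_EC) - (EC - EF)_emitter - (EF - EV)_substrate
    delta_ec = (BUILT_IN_POTENTIAL_EV - SI_GAP_EV
                + EMITTER_ACTIVATION_EV + substrate_fermi_ev)
    delta_ev = (EMITTER_GAP_EV - SI_GAP_EV) - delta_ec
    return delta_ec, delta_ev


delta_ec, delta_ev = band_offsets()
print(f"delta_EC = {delta_ec:.2f} eV, delta_EV = {delta_ev:.2f} eV")
```

The hole barrier follows from the gap difference once the conduction-band offset is fixed, which is why the wide optical gap of the emitter matters for suppressing back injection.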


4. Transistor characteristics 

A conventional Si bipolar process was used
for the HBT fabrication, as represented schematically
in Fig. 5. A field oxide was grown
on a 1 Ω·cm (111) n-Si substrate, and windows
were opened in the oxide. The base regions
were formed by implanting boron ions
at 32 keV with doses of
5 × 10¹² cm⁻² to 1 × 10¹⁵ cm⁻². This was
followed by annealing at 900 °C for 30 min to
activate the implanted species. An interlayer
dielectric film of SiO2 was deposited, in which
emitter windows were opened. The typical
emitter size is 5 × 5 µm². Immediately after
dipping the wafer in an HF solution, it was
put into the reaction chamber of a plasma
CVD system to deposit the phosphorus-doped
µc-Si:H film. The µc-Si:H film, except for the
emitter region, was then etched in an NF3 plasma.
The base contact windows were opened in the
SiO2, and the emitter and base electrodes (Al-Si)
were deposited by sputtering. The doping
densities in the base region of the present
devices are summarized in Table 1. The sub-emitter
base sheet resistances are also given.

Table 1. Base doping densities

Sample | Base implant dose (cm⁻²) | Peak base doping (cm⁻³) | Sub-emitter sheet resistance (Ω/□)
#1     | 5E12                     | 4.0E17                  | 13.6 k
#2     | 5E13                     | 3.6E18                  | 1.43 k
#3     | 5E14                     | 3.9E19                  | 193
#4     | 1E15                     | (illegible)             | 82

In Fig. 6, the maximum common emitter
current gains of the present transistors are
plotted as a function of the measured sub-emitter
base sheet resistance. As can be seen
in this figure, the present HBT has a much
higher current gain than a conventional homojunction
transistor. The maximum current
gain of sample #1 is 1500 with a base sheet
resistance of 14 kΩ/□, and that of sample
#4 is 18 with a base sheet resistance as
low as 95 Ω/□. These results indicate that
the base resistance of the present HBT can
be reduced while maintaining high current
gain. Therefore, the µc-Si:H HBT has the
potential of surpassing silicon homojunction
transistors in speed performance. The common
emitter IC-VCE characteristics of samples #2 and
#4 are shown in Figs. 7a) and b), respectively.
Satisfactory transistor operation can be clearly
observed despite the heavy base dose. The
dependence of current gain on collector current
is shown in Fig. 8. The maximum current
gain is obtained at a collector current density as high
as 3 × 10⁴ A/cm², which is quite satisfactory
for scaled high-speed LSIs. Figure 9 shows
the characteristics of the collector current
IC and base current IB as a function of base-emitter
voltage. The ideality factors n for the
collector current and base current are about
1.0 and 1.7, respectively. This result indicates
that IC and IB are dominated by the diffusion
current and the recombination current, respectively.
The degradation of current gain at low
collector current density, as seen in Fig. 8, is
caused by this recombination. The µc-Si:H HBT
is expected to have a higher current gain if the
interface recombination centers can be reduced.

Fig. 6—Maximum common emitter current gain hFEmax
of the µc-Si:H HBT as a function of measured
sub-emitter base sheet resistance.

Fig. 7—Common emitter IC-VCE characteristics.
a) Sample #2: base implant dose 5E13 cm⁻²
b) Sample #4: base implant dose 1E15 cm⁻²

Fig. 8—Common emitter current gain hFE dependence
on collector current density of the µc-Si:H HBTs.

Fig. 9—Characteristics of collector current and base
current as a function of VBE for sample #2.

Fig. 10—Resistivity of µc-Si:H film vs. inverse
temperature.
[Plot: resistivity (log scale, 0.02 to 0.1) versus inverse absolute temperature (1 000/K, 4 to 14), with temperature (°C) on the top axis.]

Fig. 11—Contact resistance of µc-Si:H with Al vs.
inverse temperature.

Fig. 12—Common emitter IC-VCE characteristics of the
µc-Si:H HBT.
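
The ideality factors quoted for Fig. 9 are the kind of number obtained from the slope of a semilog I-V plot. The sketch below shows that extraction from two points, assuming the current follows exp(qV/(nkT)) over the chosen range. The saturation currents and bias points are hypothetical and serve only to exercise the function; they are not measured values from the paper.

```python
import math

K_B_OVER_Q_V_PER_K = 8.617333e-5  # Boltzmann constant over elementary charge, V/K


def ideality_factor(v1: float, i1: float, v2: float, i2: float,
                    temperature_k: float = 300.0) -> float:
    """Ideality factor n from two (V, I) points, assuming I ~ exp(qV / (n k T))."""
    if i1 <= 0 or i2 <= 0 or i1 == i2:
        raise ValueError("currents must be positive and distinct")
    thermal_voltage = K_B_OVER_Q_V_PER_K * temperature_k
    return (v2 - v1) / (thermal_voltage * math.log(i2 / i1))


def exponential_current(v: float, i_sat: float, n: float,
                        temperature_k: float = 300.0) -> float:
    """Synthetic current used only to exercise ideality_factor()."""
    return i_sat * math.exp(v / (n * K_B_OVER_Q_V_PER_K * temperature_k))


# Hypothetical saturation currents; only the slopes (n = 1.0 and 1.7) mirror the text.
for label, i_sat, n_true in (("IC", 1e-16, 1.0), ("IB", 1e-12, 1.7)):
    i_lo = exponential_current(0.50, i_sat, n_true)
    i_hi = exponential_current(0.60, i_sat, n_true)
    print(label, round(ideality_factor(0.50, i_lo, 0.60, i_hi), 2))
```

A value near 1 is the usual signature of diffusion current, while a value approaching 2 points to recombination, which is the reading the text gives for IC and IB.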


5. Low-temperature operation 

Before the transistor characteristics were
measured at low temperatures, the increase
in the emitter resistance at low temperatures
was estimated. Figure 10 shows the temperature
dependence of the resistivity of the
µc-Si:H film. The resistivity of the µc-Si:H
film at LNT is increased only by a factor
of two. Hall measurements show that this
increase in resistivity is due to a degradation
in electron mobility. Figure 11 shows the
temperature dependence of the contact
resistance of the µc-Si:H with Al-Si. The
contact resistance is increased by less than a
factor of two. These results indicate that the
increase in the emitter resistance of the present
HBT is expected to be small. The base sheet resistance
of the measured sample is about 5.0 kΩ/□.
Figure 12 shows the IC-VCE characteristics
of the µc-Si:H HBT at room temperature and
LNT. Note that the µc-Si:H HBT operates
normally even at LNT. The variation of the
maximum common emitter current gain with
temperature for the present HBT is shown
in Fig. 13. The temperature dependence of
the maximum current gain for the present
HBT is much smaller than what is normally
observed in homojunction transistors, in which
the gain decreases exponentially with decreasing
temperature. This result indicates that hole
injection from the base into the emitter is
suppressed by the effect of the wide gap emitter.
This successful low-temperature operation makes
the µc-Si:H HBT promising
for future high-speed bipolar and Bi-CMOS
LSIs.

Fig. 13—Maximum current gain of the µc-Si:H HBT vs.
inverse temperature.
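
To give a feel for the exponential gain loss in homojunction transistors mentioned above, the sketch below assumes the commonly used form hFE ∝ exp(−ΔEg/kT), where ΔEg is an effective band-gap narrowing of the heavily doped emitter relative to the base. The 50-meV value is hypothetical and chosen only for illustration, and the temperature dependence of the prefactor is ignored.

```python
import math

K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant, eV/K


def gain_ratio(delta_eg_ev: float, t_low_k: float, t_high_k: float) -> float:
    """hFE(t_low) / hFE(t_high) for hFE proportional to exp(-delta_eg / kT).

    delta_eg_ev is the effective band-gap narrowing of the emitter relative to
    the base; temperature dependence of the prefactor is ignored.
    """
    exponent_low = -delta_eg_ev / (K_B_EV_PER_K * t_low_k)
    exponent_high = -delta_eg_ev / (K_B_EV_PER_K * t_high_k)
    return math.exp(exponent_low - exponent_high)


# 50 meV is a hypothetical narrowing chosen only for illustration.
print(f"{gain_ratio(0.050, t_low_k=77.0, t_high_k=300.0):.4f}")
```

With this assumed narrowing the gain at 77 K falls to well under one percent of its room-temperature value, which illustrates why a wide gap emitter that removes the unfavourable exponent is attractive at LNT.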


6. Conclusion 


An npn Si HBT has been fabricated using
µc-Si:H as a wide gap emitter. It shows a much
higher current gain than a conventional homojunction
transistor, so the base
resistance of the µc-Si:H HBT can be reduced
while maintaining high current gain. Since the
µc-Si:H has low resistivity, it is well
suited to scaled high-speed LSI applications.
The present HBT operates normally
at LNT.
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The characteristics and the proximity effect of resist in electron-beam direct-writing was 
studied to form submicron patterns. MOSFETs were obtained with an effective channel
length down to about Le = 0.32 µm. The transistors fabricated using this technique operate
well without punch-through. By evaluating the dispersion of many transistors, it was found
that there is a strong possibility that devices with a minimum pattern size of 0.2-0.3 µm can
be manufactured for practical use.


The influence of the electron-beam direct-writing on the reliability of devices was studied
and it was confirmed that this method is sufficiently reliable when used for gate-electrode
formation.


1. Introduction 

The scale of integrated circuits (ICs) has 
been rapidly increasing, and today, the 64M-bit 
DRAM is attracting considerable attention. 
A very wide range of technologies is required 
to improve the scale of integration. An 
integrated circuit does not simply involve 
the integration of devices but can also mean 
the integration of technologies. Lithography 
is an indispensable part of IC manufacture. 

Until now, pattern delineation has
depended entirely on optical lithography.
However, as integration increases and submicron
patterns are required, we are approaching
the limit of optical lithography. Lithographies
using electron beams (EB) or X-rays have
been developed to replace optical lithography.
Since the EB can be finely focused and positioned,
EB direct-writing technology (in which
patterns are drawn directly on the wafer without
masking) is the most reliable submicron-pattern
formation technique available.

However, a highly accurate submicron
pattern may not be guaranteed even with
EB direct writing. There are many factors
that influence the pattern accuracy. The characteristics
of resist materials play an especially
important role. The resolution of negative resists
is degraded by post-polymerization effects,
in which cross-linking continues and the
dimensions of patterns change even after EB
irradiation. The proximity effect, caused by
electrons scattered back from the substrate,
is also a problem. Furthermore, high dry-etching
resistance is required to transfer
a pattern to the underlying materials with high
precision. In this work, these problems were
studied, and as a result, highly accurate patterns
have been achieved. These patterns were
achieved by selecting single-layer chloromethylated
polystyrene (CMS) as a resist, by optimizing
the molecular weight of the resist, and
by correcting the proximity effect.

A transistor is the most important element
of an integrated circuit, and as integration
increases, its dimensions are being reduced.
The physics of a MOSFET change as its
dimensions are reduced, and a submicron MOSFET
shows unique characteristics. Therefore, it is
important to know the behavior of submicron
devices made with high dimensional precision,
which is not attainable by conventional optical
lithography.

For this reason, MOSFETs with fine polysilicon
gates were fabricated by EB direct
writing, and the influence of the lithography
precision on the uniformity of the electrical
characteristics of the MOSFETs was evaluated.
The long-term reliability of devices irradiated
by an electron beam and the behavior
of devices in the deep submicron region of less
than 0.5 µm were also evaluated.


Table 1. Properties and characteristics of CMS at 20 kV

Mw       Mw/Mn    Chloromethylation   Sensitivity (µC/cm²)   γ-value   Resolution
                  ratio (%)           D0        D50                    (µm)
53 000   -        -                   0.8       1.8          1.5       2.1
13 000   < 1.08   -                   9.0       12.0         4.1       1.5
7 500    -        -                   -         -            4.7       -

Mw  : Weight-average molecular weight
Mn  : Number-average molecular weight
D0  : Critical dose of gelation
D50 : Dose at which half the initial thickness remains
γ   : γ = [2 log(D50/D0)]⁻¹


Fig. 1 a) 1.0-µm wide space, Mw: 7500


2. Fabrication process for submicron MOSFET 
2.1 Electron beam resist 

Chloromethylated polystyrene (CMS) is
well known as a dry-etching-resistant negative
resist free from post-polymerization. TOSOH
CMS-EX is a high-sensitivity resist and CMS-EXR
is used for high resolution. As it is impossible
to form a submicron pattern using
these resists, a resist of higher resolution is
required. In general, the resolution of negative
EB resists increases with decreasing
molecular weight. Therefore, the molecular
weight of CMS was studied with regard to the
formation of quarter-micron patterns.

Fig. 1—SEM photograph of 3.0-µm wide line of CMS.
b) 1.5-µm wide space, Mw: 13 000

Fig. 2—Exposure intensity curves of 0.4-µm thick CMS
for various designed widths (line widths of 0.6, 1.0, 2.0,
and 3.0 µm; axes: electron dose (µC/cm²) versus line width/2 (µm)).

Table 1 shows the properties and characteristics
of the three types of CMS evaluated in
this paper. The resolution represents the minimum
space width achieved between 3-µm wide
lines. The initial resist was 1.2 µm thick, and 90
percent remained after development. The
sensitivity decreases as the molecular weight
decreases, but the γ-value shows that the resolution
increases. Figure 1 shows SEM photographs
of the narrowest space between
3-µm wide lines for molecular weights of
7500 and 13 000. Figure 1 shows that lowering
the molecular weight improves the resolution.
However, since an extremely low molecular
weight causes poor heat resistance (low Tg)
and a long exposure time (large D0), a CMS
having a molecular weight of 7500 was used
in the following experiments.
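
The γ-value in Table 1 can be computed from the two doses D0 and D50. The sketch below assumes that the footnote definition reads γ = 1/(2 log10(D50/D0)) with a base-10 logarithm; the function name and the sample doses are hypothetical and are not taken from Table 1.

```python
import math


def contrast_gamma(d0: float, d50: float) -> float:
    """Contrast of a negative resist from the gel dose D0 and the
    half-thickness dose D50 (both in the same units, e.g. uC/cm^2).

    Assumed definition: gamma = 1 / (2 * log10(D50 / D0)).
    """
    if d0 <= 0 or d50 <= d0:
        raise ValueError("need 0 < D0 < D50")
    return 1.0 / (2.0 * math.log10(d50 / d0))


# Hypothetical doses, chosen only to illustrate the trend:
# a narrower D0-to-D50 window gives a larger gamma (higher contrast).
wide_window = contrast_gamma(10.0, 20.0)
narrow_window = contrast_gamma(10.0, 13.0)
print(f"gamma (D50/D0 = 2.0): {wide_window:.2f}")
print(f"gamma (D50/D0 = 1.3): {narrow_window:.2f}")
```

A resist whose thickness rises from zero to half its initial value over a small dose ratio therefore has a large γ, which is the sense in which the γ-value indicates resolution.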


2.2 Correction of the proximity effect 

The proximity effect is a serious problem
in electron-beam lithography. Incident electrons
are scattered in the resist and the substrate. Some
electrons penetrate through the resist into the
substrate and are then scattered back into the
resist.

Consequently, the resist is exposed both
by incident electrons and by backscattered
electrons. If patterns are close together, the
resist is also exposed to backscattered electrons
from other patterns. This phenomenon is called
the proximity effect and degrades the
pattern accuracy. It must be corrected
to improve the accuracy.

Table 2. Correction parameters for CMS
(columns: thickness (µm), acceleration energy (kV), substrate material, parameters A, B, C)
0.001 4 | 3.500

T.H.P. Chang?) has reported that the ex- 
posure intensity distribution (EID) can be close- 
ly approximated by the sum of two Gaussian 
distributions. 

In this experiment, the EID was expressed 
as follows: 


$$ f(r) = \exp\{-(r/A)^2\} + B \exp\{-(r/C)^2\} \qquad (1) $$


The first term represents the effect of the 
incident primary beam and the second is that 
of backscattered electrons. Parameter A 
represents the horizontal spread of an incident 
primary beam, B represents the intensity ratio 
of two Gaussian distributions, C represents 
the horizontal spread of backscattered electrons, 
and r represents the distance from the center 
of the incident beam. 

These parameters depend on the acceleration 
voltage of the electron beam, resist and substrate 
materials, resist thickness, and the development 
conditions. 
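
As a minimal sketch of the exposure intensity distribution described above, the following function evaluates the sum of the two Gaussian terms. The function name and the parameter values are hypothetical (they are not the values of Table 2); only the roles of A, B, C, and r follow the text.

```python
import math


def eid(r_um: float, a_um: float, b: float, c_um: float) -> float:
    """Double-Gaussian exposure intensity distribution.

    r_um : distance from the center of the incident beam (um)
    a_um : horizontal spread of the incident primary beam (um)
    b    : intensity ratio of the two Gaussian terms
    c_um : horizontal spread of the backscattered electrons (um)
    """
    primary = math.exp(-((r_um / a_um) ** 2))
    backscattered = b * math.exp(-((r_um / c_um) ** 2))
    return primary + backscattered


# Hypothetical parameters for illustration only.
A, B, C = 0.1, 0.002, 3.0
for r in (0.0, 0.1, 1.0, 3.0):
    print(f"r = {r:3.1f} um  f(r) = {eid(r, A, B, C):.6f}")
```

Near the beam center the first term dominates; a few tenths of a micron away only the weak but wide backscattered term remains, which is what exposes neighbouring patterns.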

Fig. 3—Deviations of developed resist width as a function
of designed width (0.4-µm thick CMS; doses of 60, 70, and
80 µC/cm²; axes: deviation (µm) versus line width (µm)).

To obtain these parameters, CMS was
spin-coated onto silicon substrates to a thickness
of 0.4 µm or 1.0 µm. The resist was then exposed
at an acceleration voltage of 30 kV. Patterns
20 µm long and 0.6, 1.0, 2.0, and 3.0 µm wide
were exposed on each wafer with doses
ranging from 10 to 1000 µC/cm². Figure 2 shows
four exposure intensity curves for the 0.4-µm
thick resist. The term f(r) in Equation (1)
expresses the EID for a small spot beam. EIDs
for various patterns are obtained by integrating
f(r) over the irradiated pattern as
F(R) = ∬ f(r) dx dy, where R is the distance from
the center of the irradiated pattern. Parameters
A, B, and C were determined by matching
F(R) to the measured exposure intensity
curves shown in Fig. 2. Table 2 shows the results
obtained by this method.
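
For a rectangular pattern, the double integral of each Gaussian term has a closed form in terms of the error function, so F can be evaluated without numerical quadrature. The sketch below assumes the double-Gaussian form of Equation (1) and unit dose; the function names and the parameter values are hypothetical and are not those of Table 2.

```python
import math


def gaussian_over_rect(x, y, rect, s):
    """Integral of exp(-((x-u)^2 + (y-v)^2) / s^2) over a rectangle.

    (x, y) is the observation point, rect = (x1, x2, y1, y2) in um,
    and s is the Gaussian spread parameter in um.
    """
    x1, x2, y1, y2 = rect
    ix = math.erf((x2 - x) / s) - math.erf((x1 - x) / s)
    iy = math.erf((y2 - y) / s) - math.erf((y1 - y) / s)
    return math.pi * s * s / 4.0 * ix * iy


def pattern_exposure(x, y, rect, a, b, c):
    """Exposure at (x, y) from a uniformly irradiated rectangle."""
    return gaussian_over_rect(x, y, rect, a) + b * gaussian_over_rect(x, y, rect, c)


# Hypothetical parameters; the line geometry (20 um long, four widths)
# mirrors the test patterns described in the text.
A, B, C = 0.1, 0.002, 3.0
for width in (0.6, 1.0, 2.0, 3.0):
    line = (-width / 2, width / 2, -10.0, 10.0)
    print(f"width {width:.1f} um: center exposure = "
          f"{pattern_exposure(0.0, 0.0, line, A, B, C):.5f}")
```

Fitting A, B, and C then amounts to adjusting them until such computed curves match the measured ones; the fitting step itself is not shown here.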

In Table 2, parameters A and B decrease
with decreasing resist thickness. A small value of
A indicates a small horizontal spread of forward
scattering at the resist-substrate interface. A
small value of B indicates that the exposure is
due mainly to incident electrons rather than
backscattered electrons. Therefore, reducing the
resist thickness is an effective way to improve
resolution.

Y. Machida et al.” developed a proximity 
effect correction which had two modes. One 
correction was for the dose and the other was 
to correct the pattern dimensions. In this 
experiment, the correction method was applied 
to submicron-gate patterns. The dose and 
dimension of each pattern were calculated 
using the values of parameters A, B, and C 
obtained above. 

The fabrication of small gate patterns using
a 0.4-µm thick CMS is described below. After
coating, the resist was prebaked for 100 s at
80 °C on a hotplate. The resist was developed
by a drip method using a solution of one part
isoamyl acetate to nine parts ethyl cellosolve
as the developer and isopropyl alcohol as
the rinse.

Fig. 4—Deviations of developed resist width and etched
polysilicon film width as a function of designed
width (0.4-µm thick CMS; dose: 70 µC/cm²; polysilicon after
etching; axes: deviation (µm) versus line width (µm)).

Figure 3 shows the deviation from the
designed dimensions for line patterns down to
0.2 µm wide. The doses shown in the figure
are basic exposure doses. The doses for large
patterns and the actual exposure doses are
determined according to the pattern dimensions
and the proximity effect correction. The dashed
line in Fig. 2 indicates the doses for various
design widths, and shows that a smaller pattern
requires a higher dose.
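
The trend that a smaller pattern requires a higher dose can be illustrated with a simplified dose-only correction for an isolated, infinitely long line. This is a sketch under stated assumptions, not the two-mode correction method used in the paper: it assumes a double-Gaussian exposure distribution with spreads A and C and ratio B, and all names and parameter values are hypothetical.

```python
import math


def center_exposure(width_um, a_um, b, c_um):
    """Exposure at the center of an infinitely long line at unit dose.

    Integrating each Gaussian term exp(-(r/s)^2) over the line gives
    pi * s^2 * erf(width / (2 s)).
    """
    primary = math.pi * a_um ** 2 * math.erf(width_um / (2.0 * a_um))
    back = b * math.pi * c_um ** 2 * math.erf(width_um / (2.0 * c_um))
    return primary + back


def relative_dose(width_um, a_um, b, c_um, ref_width_um=10.0):
    """Dose, relative to a wide reference line, that gives the same
    center exposure for a line of the given width."""
    ref = center_exposure(ref_width_um, a_um, b, c_um)
    return ref / center_exposure(width_um, a_um, b, c_um)


# Hypothetical parameters for illustration only.
A, B, C = 0.1, 0.002, 3.0
for w in (0.2, 0.6, 1.0, 3.0, 10.0):
    print(f"design width {w:4.1f} um -> relative dose {relative_dose(w, A, B, C):.2f}")
```

Narrow lines receive little backscattered exposure from their own area, so the incident dose has to rise to compensate.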

Errors in the formed pattern width were
within ±0.04 µm at 7 × 10⁻⁵ C/cm²
for design dimensions of 0.2 µm to 10 µm.
Accurate pattern delineation was thus achieved
using the proximity effect correction. The
patterns were measured with a HITACHI S-6000
SEM dimension measuring instrument.


2.3 Submicron gate formation 

Polysilicon etching was carried out using
a parallel-plate cathode-coupled machine.

SF6 gas was studied first. A non-doped
0.4-µm thick polysilicon film was etched using
a 1.0-µm thick resist film. With pure
SF6, the selectivity of polysilicon to silicon
dioxide is ten or more and that to the resist
is three. However, an undercut of about 0.1 µm
was observed. To eliminate the
undercut, C2ClF5 was added to the SF6 to
protect the side wall. As a result, the undercut
was controlled, but the selectivity to the oxide
was reduced to four. It was therefore concluded
that it is impossible to form submicron gate
electrodes with SF6.

Fig. 5—Cross-sectional SEM photograph of 0.25-µm long
polysilicon gate (polysilicon thickness: 400 nm).

Br2 was then studied as an etching gas
because it reportedly maintains a high selectivity
during etching 4),5). The etching conditions were as
follows: RF power of 250 W, Br2 gas at 70 sccm
and He gas at 70 sccm, a pressure of 0.2 Torr,
and 10 percent over-etching. A non-doped
0.4-µm thick polysilicon film was etched using
a 0.4-µm thick resist film. The selectivity of
polysilicon was found to be 30 to the
oxide and 10 to the resist. The
finished sample showed quite a steep wall
without undercut.

Figure 4 shows the error in polysilicon
width compared with the resist width. The
polysilicon is about 0.05 µm wider than the
resist. This anomalous increase in
width is believed to be caused by the protection
of the side wall.
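
The practical meaning of the selectivity figures can be checked with simple arithmetic. The sketch below assumes constant etch rates, that the oxide is exposed only during the over-etch, and that the same 10 percent over-etch applies to the SF6 mixture (the text states it only for Br2); the function name is hypothetical.

```python
def removed_thickness_nm(poly_nm: float, etch_fraction: float, selectivity: float) -> float:
    """Thickness of a second material removed while an amount
    etch_fraction * poly_nm of polysilicon-equivalent etching occurs.

    selectivity = (polysilicon etch rate) / (etch rate of the other material).
    """
    return poly_nm * etch_fraction / selectivity


POLY_NM = 400.0   # polysilicon thickness from the text
OVERETCH = 0.10   # 10 percent over-etching

# Oxide is attacked only during the over-etch.
oxide_loss_br2 = removed_thickness_nm(POLY_NM, OVERETCH, 30.0)
oxide_loss_sf6_mix = removed_thickness_nm(POLY_NM, OVERETCH, 4.0)
# Resist is attacked during the whole etch, including the over-etch.
resist_loss_br2 = removed_thickness_nm(POLY_NM, 1.0 + OVERETCH, 10.0)

print(f"oxide loss, Br2 (selectivity 30):         {oxide_loss_br2:.1f} nm")
print(f"oxide loss, SF6 + C2ClF5 (selectivity 4): {oxide_loss_sf6_mix:.1f} nm")
print(f"resist loss, Br2 (selectivity 10):        {resist_loss_br2:.1f} nm")
```

Under these assumptions a selectivity of four would consume a large part of a gate oxide a few tens of nanometres thick, whereas a selectivity of 30 removes only about a nanometre, and a 0.4-µm resist comfortably survives the Br2 etch.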


2.4 Device fabrication process 

MOSFETs were fabricated using the techniques
explained above. These MOSFETs
had polysilicon gate electrodes about 0.25 µm
to 20 µm in length. Electron-beam direct writing
was used only for the gate pattern delineation,
while conventional optical lithography was used
for the other operations.

The fabrication conditions were as follows:
the substrates were p-type silicon wafers of
10 Ω·cm, the channel regions were implanted
with boron ions to a dose of 1, 2, or 5 ×
10¹² cm⁻² at 40 keV, and a 20-nm gate oxide
was grown using HCl oxidation. The source
and drain were formed by implanting arsenic ions at
3 × 10¹⁵ cm⁻² and 70 keV. Figure 5 shows
an SEM photograph of a 0.25-µm long gate.

Fig. 6—Variations of gate length Lgate from design length
Ldesign (horizontal axis: Ldesign (µm)).

When the polysilicon gate was patterned, the
gate lengths of eleven transistors were measured
along one diameter of a four-inch wafer for each
design length. The results are shown in Fig. 6.
For a design length of 0.20 µm, the
measured length was 0.25 ± 0.02 µm. The
deviation was within ±10% for each length.


3. Submicron MOSFET 

This chapter discusses the characteristics
of the MOSFETs fabricated by the lithography
described above. Figure 7 shows the ID-VD
characteristics of a device with a channel doping
of 5 × 10¹² cm⁻² and Leff of 0.32 µm. No
punch-through characteristics were observed
and operation was stable.

Fig. 7—ID-VD characteristics of a MOSFET of
Leff = 0.32 µm (Ldesign = 0.3 µm,
channel doping: boron, 5 × 10¹² cm⁻²);
axes: drain current ID (mA) versus drain voltage VDS (V).

Fig. 8—Threshold voltage shifts and subthreshold swing
shifts as short channel effects (axes: subthreshold swing S
(V·decade⁻¹) versus gate length Lgate (µm)).


3.1 Short-channel effect 


When the channel length of a MOSFET is
reduced, the electric charges in the channel are
affected not only by the field formed by the gate
but also by the fields formed by the source and drain.
One of the short-channel effects is a lowering
of the threshold voltage, Vth. The gate voltage
increment required to increase the drain current
by a factor of ten in the weakly inverted state,
with a gate voltage less than or equal to the
threshold voltage, is called the subthreshold
swing, S. S increases as the channel length
decreases. Figure 8 shows the threshold voltage
and the subthreshold swing of devices with
a channel doping up to 1 × 10¹² cm⁻². The
threshold voltage was determined by extrapolating
the ID-VG characteristics in the linear
region with VD = 0.1 V. The subthreshold
swing S was measured with VD = 3 V. It can
be seen that Vth decreases and S increases as
the gate length is reduced below 0.6 µm. The
decrease in the threshold voltage and the
increase in the subthreshold swing can be
reduced by increasing the substrate impurity
concentration or by reducing the gate oxide
thickness according to the scaling rule. However,
since the device then suffers from a lower
drain-breakdown voltage, an increased substrate-bias
effect, and lower reliability of the gate insulator, practical
devices should be operated under conditions in
which short-channel effects are minimized.
The accuracy of the threshold voltage must
be assured by precise control of the gate length.
Dispersions of the threshold voltage within one chip
(8 × 8 mm²) and over a wafer were therefore
studied for the devices represented in Fig. 8.
The results are summarized in Table 3.

Table 3. Variations of threshold voltage

Gate length, designed       Threshold voltage
(measured) (µm)             Mean (V)    σ in a chip (V)    σ in a wafer (V)
10 (-)                      0.456       0.001              0.018
0.4 (0.46)                  0.369       0.006              0.023
0.3 (0.36)                  0.263       0.008              0.061
0.2 (0.25)                  0.043       0.023              0.130

Fig. 9—Channel length dependence of substrate current.

The standard deviations within a chip were
obtained for 20 transistors in a chip near
the center of a wafer. The standard deviations
within a wafer were obtained for 1 000 to 1 700
transistors randomly positioned over a four-inch
wafer. For 0.3-µm devices, the standard deviation
of Vth within a wafer is 61 mV, which corresponds
to a gate-length deviation of about 0.03 µm
according to Fig. 8. This is nearly equal to the previous
figure of ±0.02 µm for eleven samples. Therefore,
if the tolerable precision of the gate length
is ten percent, the present technology can be
applied to device fabrication at a level of 0.2 µm
or 0.3 µm.
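
The conversion from a threshold-voltage spread to a gate-length spread can be cross-checked roughly from Table 3 alone. The sketch below is an assumption-laden estimate, not the authors' procedure (they read the slope from Fig. 8): it approximates dVth/dL around the 0.3-µm design row by a finite difference between the neighbouring rows and assumes that all of the Vth spread comes from gate-length variation.

```python
# (measured gate length in um, mean threshold voltage in V), from Table 3
rows = [(0.25, 0.043), (0.36, 0.263), (0.46, 0.369)]
SIGMA_VTH_WAFER = 0.061  # V, in-wafer sigma for the 0.3-um design row


def central_slope(points):
    """Finite-difference dVth/dL (V/um) across the outer two points."""
    (l_lo, v_lo), _, (l_hi, v_hi) = points
    return (v_hi - v_lo) / (l_hi - l_lo)


slope = central_slope(rows)
sigma_length = SIGMA_VTH_WAFER / slope

print(f"dVth/dL near 0.36 um: {slope:.2f} V/um")
print(f"implied gate-length sigma: {sigma_length:.3f} um")
```

The result is of the same order as the 0.03 µm quoted in the text; the difference reflects the coarseness of a three-point slope.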


3.2 Hot carrier effect 

This section discusses the problem of high
electric fields in a miniaturized device, especially
the hot carrier effect. Today, as we enter the
era of the submicron device, Drain Avalanche
Hot Carriers (DAHCs) generated at the drain
edge often cause problems when relatively high
drain voltages are applied. Electrons that gain
a high energy from the strong electric field in
the channel region near the drain collide with
lattice atoms and generate electron-hole pairs.
Avalanche multiplication occurs when the
generated electrons and/or holes create more
and more pairs. These carriers become hot
and penetrate the gate oxide. To study
device degradation caused by DAHCs,
the substrate current is measured as an indicator
of the amount of DAHCs generated. The
relationship between the substrate current
and the device degradation (e.g. the change
in transconductance gm) has been thoroughly
investigated.

Fig. 10—Device life as a function of substrate current
(τ = t at Δgm/gm0 = 10 percent, VG = 1/2 VD).

From observation, the life of a device τ is
represented by

$$ \tau = D \, (I_{sub})^{-p} $$

where Isub is the substrate current and p is a
constant of about 3.0 to 3.4. Isub is related
to the effective channel length Leff by


Isyp = Eexp (Lex J 


Figures 9 and 10 show the above relationships.
In Fig. 10, the life τ is defined as the time
in which the transconductance gm falls by ten
percent. The substrate current was controlled
by varying the drain voltage VD and gate voltage
VG while maintaining the condition VG = 1/2 VD.
The area near the drain that is damaged by
DAHCs is believed to be about 0.1 µm long for
a channel length of 0.5 µm to 1.0 µm. If this
value does not decrease when the channel
length is reduced, the ratio of the damaged
area to the entire channel of the transistor
increases. Therefore, if the channel length
is reduced, the transconductance may
deteriorate more. However, the channel length
did not appear to affect the life of devices
even when the channel was 0.2 µm long. Therefore,
the authors believe that the size of the
damaged area is to some extent reduced in
proportion when the channel is shortened, or
that degradation of the damaged area governs
the conductance of the entire device.

Figures 9 and 10 show appropriate supply
voltages for miniaturized devices. If a 10-year
(3.16 × 10⁸ s) life is required, the supply voltage
must be reduced to about 2.5 V for 0.2-µm
devices without high-voltage resistant structures
such as LDD (Lightly Doped Drain). The maximum
voltage is derived from the voltage that
yields the maximum permissible substrate
current of 10°77 A-uwm™!. This current is 
determined by extrapolating the linear relationship
on the log-log graph of life τ versus substrate
current Isub. However, it has also been
reported that the degradation decreases
as the gate oxide becomes thinner in proportion
to the gate length. This relationship requires
further study.
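
The extrapolation on the log-log graph can be sketched as a power-law fit followed by inversion for the required life. This is a minimal illustration assuming the relation τ = D·Isub^(−p) given above; the two stress points are hypothetical values invented for the example (generated with p = 3.2), not data from Fig. 10.

```python
import math

TEN_YEARS_S = 10 * 365.25 * 24 * 3600  # about 3.16e8 s


def fit_power_law(i1, t1, i2, t2):
    """Fit tau = D * I**(-p) through two (substrate current, life) points."""
    p = -math.log(t2 / t1) / math.log(i2 / i1)
    d = t1 * i1 ** p
    return d, p


def max_substrate_current(d, p, life_s):
    """Largest substrate current that still gives the required life."""
    return (d / life_s) ** (1.0 / p)


# Hypothetical accelerated-stress points: (A per um of width, seconds).
I1, T1 = 1e-5, 1e3
I2, T2 = 1e-6, 1e3 * 10 ** 3.2

D_FIT, P_FIT = fit_power_law(I1, T1, I2, T2)
I_MAX = max_substrate_current(D_FIT, P_FIT, TEN_YEARS_S)

print(f"fitted exponent p = {P_FIT:.2f}")
print(f"10 years = {TEN_YEARS_S:.3e} s")
print(f"permissible substrate current = {I_MAX:.2e} A/um")
```

With the permissible substrate current known, the supply voltage limit follows from the measured dependence of substrate current on drain voltage for the channel length in question.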


3.3 Damage caused by electron beam irradiation 

In this study, since the performance of
devices fabricated by electron-beam direct
writing is being appraised, the effects of damage
induced by electron-beam irradiation must
be evaluated. Irradiation of MOS devices may
create centers that trap electrons or holes
in the gate oxide or at the silicon-silicon dioxide
interface. MOS devices are more susceptible
to this damage than bipolar devices because
their performance and reliability depend on
the electrical properties of the gate oxide
and the interface. The projected range of irradiated
electrons is about 1 µm to 2 µm at
acceleration energies of 10 keV to 30 keV, which
are usually used in electron-beam exposure.
This is a very large spread. Therefore, whichever
level of the layer structure is being written,
the gate oxide cannot escape exposure to electrons.

Fig. 11—LDD-MOSFETs.

The effects of electron irradiation differ 
with device structures and irradiation condi- 
tions. The threshold voltage of MOSFETs 
is shifted by about minus 1V to 2V_ by 
107° C-cm~?. of irradiation. However, the 
change is immediately recovered by high-
temperature annealing, and no effects on the
initial device characteristics are found.
The annealing effect was confirmed in the
following experiment.

Table 4. Changes of transconductance
caused by DC stress

SD (VD = 6 V)                   LDD (VD = 10 V)
Irradiated   No irradiation     Irradiated   No irradiation
4.2%         4.5%               5.7%         5.6%
1.2%         1.0%               1.1%         0.7%

(Lgate = 0.8 µm, VG = 1/2 VD, 1000 s)

The MOSFETs in this experiment were
fabricated with optical lithography. Conventional
single-drain (SD) devices were irradiated
just after polysilicon-gate definition, and LDD
MOSFETs were irradiated just after the oxide
spacer formation. The dose for both structures
was 7.5 × 10⁻⁵ C·cm⁻² at 30 keV. To eliminate
the effects of other processing, a reference chip
without irradiation and a sample chip to be
irradiated were selected from adjacent
rows of chips within a wafer. After ion implantation
for source and drain formation, the wafers
were annealed in nitrogen at 900 °C for 20 min
to activate the implanted ions. In the final
process, they were annealed in nitrogen-diluted
hydrogen for 30 min at 450 °C. The lightly
doped regions were formed by P implantation
of 1:.2x10!% cm~? at 35keV, and with a 
0.2-µm long spacer. Figure 11a) shows the
threshold voltage Vth and Fig. 11b) shows the
transconductance gm of the finished LDD-
MOSFETs. As is apparent from these
histograms, no significant changes induced
by electron-beam irradiation were observed for
either value.

The influence on hot-carrier immunity was
also studied for both SD and LDD samples.
For transistors with Lgate = 0.8 µm,
SD devices were subjected to a DC stress of VD =
6 V and VG = 3 V for 1000 s, and LDD
devices were subjected to VD = 10 V and VG =
5 V for 1000 s. The number of samples used
in the test was 80 to 105 for each condition.
Changes in transconductance from the initial
value are shown in Table 4.

No significant difference was found between
irradiated and non-irradiated devices for either
the SD or the LDD structure. Therefore, the authors
conclude that there is no adverse effect of
electron-beam lithography on devices that
are annealed at a temperature of 900 °C or more
after irradiation.


4. Conclusion

There is the prospect of using the electron-
beam direct-writing technique to manufacture
devices at the quarter-micron level for practical
use. CMS was selected as the electron-beam
resist because of its high resolution and
high dry-etching resistance. By optimizing
the molecular weight of the resist and using
the proximity effect correction, a MOSFET
with a 0.25-µm long polysilicon gate was
successfully fabricated.

Fabricated MOSFETs having effective gate
lengths down to Leff = 0.32 µm operate well
without the punch-through phenomenon.
The influence of electron-beam irradiation on
device reliability was evaluated, and it was
confirmed that electron-beam direct writing
to form gate electrodes causes no problems
provided that an annealing process at a high
temperature of 900 °C or more is carried out.


SOI-Device on Bonded Wafer


• Hiroshi Gotou • Yoshihiro Arimoto • Masashi Ozeki • Kazunori Imaoka


The bonded wafer technique to fabricate SOI (Silicon On Insulator) devices has been extensively
studied. This technique has been successfully applied to the fabrication of a 64-Kbit
SOI-DRAM, which exhibits a soft error rate as low as 1/7 that of conventional DRAMs,
depending on the substrate thickness of the bonded wafer. It was found that the soft error
rate depended on the base substrate bias voltage.
It is also shown that the bonded wafer technique can solve the latchup problem in CMOS,
and is advantageous when used in the fabrication of SOI bipolar devices. Two new types
of bipolar transistor are proposed.


1. Introduction 

Research and development of Silicon On
Insulator (SOI) devices has a long history.
This research has shown that SOI devices have
many advantages, such as enabling extremely
high-speed devices and latchup-free CMOS.
Also, soft errors caused by radiation
occur less often in these devices, and they
exhibit a high breakdown voltage.

However, these features have only been 
utilized in integrated circuits for special applica- 
tions, and the use of SOI-LSI devices has not 
been widespread. This is because no wafers 
can satisfy the required conditions. To obtain 
SOI-LSI devices, a SOI wafer having the same 
quality as a conventional wafer is required. 
The wafer must also be economical for the use 
of SOI-LSI devices to become widespread. 
Moreover, it is desirable to use conventional 
LSI process technology. If these conditions 
were satisfied, SOI-LSI devices could easily 
be realized. 

The authors believe that wafer bonding, 
SIMOX, and FIPOS are candidates for practical 
use in manufacturing SOI-LSI devices. Of these 
methods, wafer bonding is the most suitable 
to produce wafers for general-purpose devices 
because bonded wafers have the following 
features:
1) High-quality wafers as good as conventional
wafers can be obtained.

2) The thickness of the buried insulator can 
be freely selected. 

3) The thickness of the active substrate can be 
freely set. 

4) Wafers having the desired resistivity can be 
used. 

5) The conduction type of the base substrate 
under the buried insulator layer can be 
selected. 

6) The cost of the bonded wafer is about 
three times that of conventional wafers, 
but is lower than other types of SOI. 

With these features, the bonded wafer has 
great potential for practical use. To verify its 
potential, the authors made an SOI wafer 
using the wafer bonding technique, and this 
wafer was used to make a 64-Kbit DRAM. 
The prototype DRAM can write to and read
all bits. Of its characteristics, the soft error
rate improved the most, by almost one order
of magnitude compared with DRAMs on bulk
wafers. A DRAM was chosen because it
requires high-quality wafers, and almost all
other devices can be produced using wafers
that are good enough to manufacture
DRAMs.

When making a CMOS device, it is important
to solve the latchup problem. The authors
have shown that complete isolation can be
achieved using the bonded SOI wafer, enabling
latchup-free devices to be made.

Table 1. Thickness of substrate

Thickness         Feature
10 µm or less     The soft errors decrease.
5 µm or less      Complete isolation is easily enabled.
1 µm or less      The parasitic capacitance decreases and the speed increases.
0.2 µm or less    The mobility increases.

BiCMOS demonstrates the potential of 
SOI devices. 

The authors have proposed two types
of bipolar transistors having a new structure
that uses thin substrates as a new alternative
for BiCMOS devices. These bipolar transistors
use, as the collector, a layer induced by an
electric field on the buried insulator side. This
electrically induced layer was made by applying
a voltage between the active substrate and the
base substrate. The resulting transistors operated
satisfactorily, thus opening a new field of
application for SOI devices. It was also found
that the region near the buried insulator has
excellent crystallinity.


2. Manufacturing the bonded SOI wafer 

The SOI wafer manufacturing technique
using wafer bonding is currently under development,
and various manufacturing methods are
being developed. These methods can be classified
from several points of view as follows:
1) Types of wafers for active substrate
   i) Epitaxial wafer
   ii) Non-epitaxial wafer
2) Types of inter-layer insulators
   i) Doped silicon oxide
   ii) Silicon thermal oxide
   iii) Other than silicon oxide
3) Substrate bonding methods
   i) Annealing
   ii) Mechanical pressing and annealing
   iii) Electric-field pressing and annealing
   iv) Pulse-field pressing and annealing
4) Thin layer formation methods
   i) Mechanical-chemical polishing
   ii) Mechanical-chemical polishing after
       chemical thin-layer formation

Fig. 1—Wafer fabrication process.

Fig. 2—Electric field effect on bonding.

The optimum thickness of the active sub- 
strate is determined according to the feature 
of the SOI device to be utilized. 

The features can be classified according to 
the thickness of the active substrate as in 
Table 1. 
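
The thresholds of Table 1 can be read as a cumulative lookup: every feature whose thickness limit is at or above the chosen active-substrate thickness applies. A small sketch of that reading follows; the function and constant names are hypothetical, and the feature strings paraphrase the table.

```python
# (upper thickness limit in um, feature) pairs paraphrased from Table 1
FEATURES_BY_THICKNESS = [
    (10.0, "soft errors decrease"),
    (5.0, "complete isolation is easily enabled"),
    (1.0, "parasitic capacitance decreases and speed increases"),
    (0.2, "mobility increases"),
]


def features_for(thickness_um: float) -> list:
    """Return the SOI features available at a given active-substrate thickness."""
    return [feature for limit, feature in FEATURES_BY_THICKNESS
            if thickness_um <= limit]


for t in (20.0, 7.0, 3.0, 0.5, 0.1):
    print(f"{t:5.1f} um: {features_for(t)}")
```

For example, a 3-µm active substrate gains the soft-error and isolation benefits but not yet the speed or mobility benefits.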

At the current state of wafer processing
technology, thin-layer formation
by mechanical-chemical polishing of a
conventional wafer is suitable for producing
active substrates of 3 µm or more. Thin-layer
formation by chemical etching
followed by mechanical-chemical polishing, using an
epitaxial wafer as the active substrate, is suitable
for producing active substrates of 5 µm or
less.

Fig. 3—Silicon pillar on SiO2 layer.

The pulse-field assisted bonding (PAB) method
developed by the authors is explained
below and is illustrated in Fig. 1. Two conventional
wafers, or an epitaxial wafer for the
active substrate and a conventional wafer for
the base substrate, are thermally oxidized to
form an oxide layer 0.5 µm thick. The wafer
surfaces are then bonded using a carbon heater
in a reduced-pressure atmosphere of 10⁻¹ Pa
while applying a pulse voltage between the
wafers to provide an electrostatic force. The
bonding temperature is 800 °C. The amplitude
of the pulse voltage is 300 V, its width 100 ms,
and its period 500 ms. A pulse voltage is used
to prevent the voltage drop caused by
electric breakdown of the inter-layer insulator.
Because each pulse is applied only briefly,
electric breakdown is unlikely to occur. Strong
bonding takes place within a short time
immediately after the pulse voltage is applied.
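
The pulse conditions translate into a duty cycle and a nominal oxide field as follows. This is only an order-of-magnitude sketch: it assumes that both wafers carry a 0.5-µm oxide, that they are bonded oxide to oxide, and that the whole pulse amplitude drops across the oxide, none of which is stated explicitly in the text.

```python
PULSE_AMPLITUDE_V = 300.0
PULSE_WIDTH_MS = 100.0
PULSE_PERIOD_MS = 500.0
OXIDE_PER_WAFER_UM = 0.5

# Fraction of the time during which the voltage is applied.
duty_cycle = PULSE_WIDTH_MS / PULSE_PERIOD_MS

# Assumption: two oxide layers in series, all of the voltage across them.
total_oxide_cm = 2 * OXIDE_PER_WAFER_UM * 1e-4
nominal_field_mv_per_cm = PULSE_AMPLITUDE_V / total_oxide_cm / 1e6

print(f"duty cycle: {duty_cycle:.0%}")
print(f"nominal oxide field during a pulse: {nominal_field_mv_per_cm:.1f} MV/cm")
```

The voltage is thus present for only a fifth of the time, which is consistent with the stated aim of limiting the opportunity for breakdown of the inter-layer insulator.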

Figure 2 shows how the bonding proceeds
when the pulse voltage is applied compared
with when it is not applied. After bonding, the
wafers are annealed for 30 min at
1 100 °C to increase the bonding strength. When
a CZ wafer is used as the active substrate, the
SOI wafer is subjected to mechanical-chemical
polishing. When an epitaxial wafer is used
as the active substrate, the layer is chemically
thinned and then subjected to mechanical-chemical
polishing.

The macroscopic bonding strength of
a wafer produced in this way is 100 kg/cm² or
more. However, it is extremely difficult to
measure the local bonding strength. To evaluate
this, 4 × 4 µm Si pillars were made on a 7-µm
thick active substrate and the pillars were washed
with water (see Fig. 3). No peeling occurred during
this treatment. This shows that the wafer has
sufficient bonding strength, even at the microscopic
level. Figure 4 shows the residual stress
of the active substrate measured by Raman
spectroscopy. The residual stress is below
the measurement limit. The warpage of the
3-inch to 6-inch SOI wafers was 50 µm or less.
Figure 5 shows a cross-sectional TEM
lattice image of a bare silicon
substrate and an oxidized silicon substrate
bonded by the above method. The image
is of the interface between the bare silicon
and the oxide film. It shows that no
crystal defects occurred at this interface,
demonstrating that an SOI substrate free of
defects up to the interface can be obtained
using this bonding method.

Fig. 4—Raman spectrum for a bonded wafer.

Fig. 5—Cross-sectional TEM lattice image.

Fig. 6—n-channel MOSFET characteristics.


3. 64-Kbit SOI-DRAM 

Figure 6 shows that the characteristics 
of a MOSFET made on a bonded SOI wafer 
are almost the same as those of a MOSFET 
made on a bulk silicon wafer. These character- 
istics were satisfactory enough to proceed 
with the manufacture for LSI devices. 

SOI wafers having active substrates 5 µm
to 20 µm thick were made using a pair of
conventional p-type wafers. A 64-Kbit DRAM
was fabricated using the 3-µm rule from these
wafers. The resulting SOI-DRAM operated
with all bits functional. Figure 7 shows the chip
and a cross-sectional SEM image. The mask
patterns and process conditions were the same
as those for conventional wafers. No problems
occurred throughout the entire process. The
DRAM was selected for fabrication because
it is very sensitive to crystallinity, thus providing
a good means for evaluating the technology.
If it is possible to make the DRAM, other
devices can also be easily made using this
technology.

Fig. 7—64-Kbit SOI-DRAM.


Fig. 8—Substrate thickness dependence of soft errors of
64-Kbit SOI-DRAM. (Axes: normalized SER versus active
substrate thickness, 0 to 20 µm.)


The access time is about ten percent shorter
than that of a DRAM made on a conventional wafer.
This may result
from the difference in the substrate bias
dependence of the threshold voltage (Vth)
between a transistor on a conventional wafer
and a transistor on an SOI wafer. The
SOI-MOSFET has a smaller substrate bias
dependence than the conventional MOSFET.
Therefore, the value of Vth during operation
of the MOSFET in the DRAM is smaller for
the SOI-DRAM, thus reducing the access time.

When the distribution of the data retention
time over all bits was checked for multiple
DRAM chips, the fraction of bits with a retention
time of less than one second was found to be
less than 0.5 percent. This means that the bonded
substrate has a quality high enough to produce
SOI-LSI devices.

It is very important to reduce the soft
errors in the LSI memory.

It is said that most soft errors are caused
by alpha-particles emitted from the materials
used in the LSI chip. To check the dependency
of soft errors on the thickness of the active
substrate, alpha-particles generated from
americium were applied to the SOI-DRAM.
Figure 8 shows the results. The number of
soft errors increases as the thickness of the
substrate increases. The SOI-DRAM has 1/7 the
soft errors of a conventional DRAM when the
substrate thickness is 5 µm. The number of
soft errors is about the same as for conventional
wafers when the thickness is 20 µm. (The
range of the alpha-particles
generated from americium in the silicon
substrate is up to 20 µm.) Thus, soft errors
are reduced when the SOI substrate is 5 µm to
20 µm thick, and the reliability of the LSI is
also improved.

Fig. 9—Substrate bias dependence of soft errors of
64-Kbit DRAM.

Fig. 10—Schematic cross-section of a thyristor.

It was also confirmed that the rate at which
soft errors occur can be varied by applying
a bias voltage to the base substrate. Figure 9
shows the relationship between soft errors
and the base substrate voltage for three different
thicknesses. When a negative bias voltage is
applied, the soft errors increase. When a positive
bias voltage is applied, the soft errors decrease.
However, a high positive voltage cannot be
applied to reduce the soft errors, because
a back channel forms near the inter-layer
insulator of the active substrate. For the p-type
SOI substrate, a negative bias voltage is generally
applied to prevent the back channel from
forming. However, a negative bias voltage
increases the soft errors.

This dependence is assumed to occur
because the band of the active substrate near
the inter-layer insulator bends slightly due
to the bias voltage of the base substrate.
Although this band bending is only slight, it is
considered effective in attracting electrons
induced by the alpha-particles toward the
insulator, varying the funneling length, and
slightly varying the gate threshold voltage of
the transistor.


4. Complete isolation 

Highly integrated CMOS devices suffer
from the latchup problem. This is because
complete isolation is extremely difficult to
achieve using a bulk silicon wafer. However,
complete isolation becomes possible when an SOI
wafer is used. Therefore, the authors believe
that a latchup-free CMOS device can be made.

To verify this, the authors studied the
thyristor structure schematically represented
in Fig. 10. The p-well and n-well source and
drain of this CMOS device are also schematically
shown. This circuit was checked for the
occurrence of the latchup phenomenon.
Figure 11 shows the results. It was found
that no latchup occurred when isolation
was complete. However, if isolation was
incomplete, that is, if the trench did not reach
the inter-layer insulator, latchup was observed.
In this case, several discontinuous points were
observed in the voltage-current characteristics
as the circuit current increased. This is assumed
to be caused by the internal resistance of the
thyristor, because it is similar to the second
breakdown observed in the power transistor.

Fig. 11—Latchup characteristics of thyristor in Fig. 10
(distance from the trench bottom to isolation
layer: a) 0 µm, b) 0.1 µm, c) 0.5 µm).


5. New SOI devices 

It is important to discuss the potential 
of high-density, high-speed SOI devices. An 
example of a potential SOI device is a BiCMOS 
in which the bipolar transistor and MOSFET 
are combined in the same chip. 

It is possible to make conventional vertical
bipolar transistors on the bonded substrate,
and obtain static characteristics of these
transistors identical to those of conventional
transistors. Therefore, it is also possible to
make an SOI bipolar device. The bipolar
transistor and conventional MOSFET can be
made on the same SOI wafer using the
conventional method. However, the thickness of the
SOI substrate must meet the requirements of
the bipolar transistor, which sacrifices the
performance of the MOSFET. That is, a
substrate of this thickness does not reduce the
stray capacitance of the MOSFET. To avoid this
problem, a lateral bipolar transistor capable
of being made on a thin active substrate 1) has
recently been studied. However, it is difficult
to make a thin base layer for this transistor.
If a thin base layer is used, then a parasitic
MOSFET causes a problem.
This is because a channel is formed near the
inter-layer insulator of the base layer, having the
emitter as the source, collector as the drain, and
base substrate as the gate electrode.

Fig. 12—Schematic cross-section of collector uses
accumulation layer (CAL) transistor.

Fig. 13—Schematic cross-section of collector uses
inversion layer (CIL) transistor.

Therefore, the authors have proposed two
types of bipolar transistors that can be made
on a thin active substrate as new alternatives
for SOI substrate utilization, and have proved
their satisfactory operation. These transistors
use layers induced by the electric field
of the base substrate as the collector. One type
uses an accumulation layer as the collector (CAL)
and the other uses an inversion layer (CIL).
Figures 12 and 13 show their structures. These
transistors were fabricated from 2-µm thick
n-type and p-type SOI wafers using epitaxial
wafers.

Fig. 14—Gummel plot of CAL transistor.
The fabrication processes are as follows:
1) CAL
i) n-type active substrate 2 µm thick
ii) Field oxide formation
iii) Collector n+ region formation
iv) Annealing
v) Base layer formation
2) CIL
i) p-type active substrate 2 µm thick
ii) Field oxide formation
iii) Base layer formation
iv) Collector n+ region formation
v) Annealing
CAL and CIL
vi) Formation of diffusion region for base
electrode
vii) SiO2 deposition
viii) Emitter hole formation
ix) Polysilicon deposition
x) Emitter formation
xi) Emitter/base/collector contact hole
formation
xii) Al wiring

Fig. 15—Gummel plot of CIL transistor.

The buried n+ layer used in conventional
transistors is not used for the CAL. Therefore,
the CAL is thinner by the thickness of this
buried n+ layer.
For the CIL, no collector diffusion region
is made other than the collector n+
region. Since the collector electrode is made
using the inversion layer, the CIL is made on
an even thinner substrate than the CAL.

Figures 14 and 15 show the measured
emitter current and base current characteristics
as a function of the base voltage for various
base substrate bias voltages. When a positive
bias voltage is applied, the base current (IB)
decreases and the emitter current (IE) increases.
The base current decreases in both the CAL
and the CIL. The emitter
current of the CAL increases in the high-current
region, but it increases uniformly in the CIL.
Figure 16 shows the relationship between
the CAL base voltage and the current gain factor
(hFE). This gain factor is improved
significantly in the low base voltage region.

Fig. 16—hFE/VB characteristics of CAL transistor.

When a heavily doped region and a lightly
doped region were made in the collector of
a conventional vertical bipolar transistor,
the base current was lower than when only a
lightly doped region was made 1). This suggests
that the dependence of the CAL and CIL
base current on the base substrate voltage arises
because an n+ layer is formed on the collector.
Similarly, the dependence of the CAL emitter
current on the base substrate voltage can be
explained by the reduced resistance of the
collector layer as a result of formation of the
n+ layer. To explain the dependence of the CIL
emitter current on the base substrate voltage,
one must also consider changes in values that
are normally fixed during operation, such as
the thickness of the base layer, in addition to
the formation of the collector n+ layer.

These transistors were found to operate 
satisfactorily, which indicates new alternatives 
for SOI devices. They also showed that the 
crystallinity of the bonding plane of the SOI 
substrate was not damaged. 




6. Conclusion 

To demonstrate that the bonded substrate
can be applied to devices, the authors
successfully fabricated 64-Kbit DRAMs, which require
high-quality wafers. Using this sample device, it
was possible to write to and read all 64 Kbits.
Using these chips to obtain the relationship
between soft errors and substrate thickness,
we found that the soft errors were 1/7 those
of conventional chips when the substrate
thickness was 5 µm. The soft error rate was about
the same as for conventional chips when the
thickness was 20 µm. It was also shown that
soft errors depend on the bias voltage of the
base substrate.

It has been shown that the latchup problem 
in CMOS devices can easily be solved by using 
the bonded SOI substrate. It has also been 
shown that complete isolation is required 
for a latchup-free CMOS device. 

We studied the potential of SOI-BiCMOS
for future high-density, high-speed SOI devices.
As new alternatives, the authors proposed
two types of bipolar transistors having a new
structure, and proved their satisfactory
operation. One uses the accumulation layer as the
collector (CAL) and the other uses the inversion
layer (CIL). These bipolar transistors can be
made on a thin active substrate, which suggests that
SOI-BiCMOS can be realized without trading off
the characteristics of the SOI-MOSFET.

Thus, the bonded substrate has great
potential for use in the manufacturing of
SOI-LSI devices.


Overview of Mask Technology

Mask-making technology plays an important role in generating the patterns for devices used
in semiconductor fabrication. Fujitsu recognizes the importance of this technology and has
developed it from the early days of semiconductor fabrication.

This paper describes the mask-making technologies currently being used: data processing,
exposure, process and inspection technology. This is followed by a discussion of the future
trends in mask-making technology.


1. Introduction 

Fujitsu established its Mask Technology
Division 25 years ago, in 1963, making it
probably the third oldest such division in the world
among semiconductor manufacturers, after IBM and TI.
This is because Fujitsu was quick to recognize
the importance of microlithography in the
development and fabrication of semiconductor
devices.

To realize Fujitsu’s objective of completely 
fabricating semiconductor devices in-house, the 
development of the technology for fabricating 
the masks started with all mask-making proc- 
esses, including materials, artwork (pattern 
generation), master masks and working masks, 
being developed simultaneously. 

This paper first describes the history of mask
technology. Then the early stages of research and
development up to the current state of the basic
technology are described. This paper lists the
subsequent technological innovations in a
chronological table and provides a supplementary
description of the major technologies,
comparing them to the corresponding IC changes.
Details of current mask technology and future
trends are described last.


2. History 


Fujitsu started plotting artwork (i.e. pattern
generation) 150 times larger than the actual size
on Kent paper using an X-Y coordinate graph
machine. In 1966, artwork was plotted 250
times the actual size on a peel coat film to
improve the pattern accuracy. In 1968, we
developed an automatic plotting machine that
drew the artwork 100 times the actual size on a
film to handle the higher integration of ICs. At
the same time, we developed computer aided
design (CAD) for automatic plotting. In 1971,
we installed a Pattern Generator® that exposed
patterns ten times their actual size on reticle
blanks to improve the pattern accuracy. We also
installed a FACOM 230-50 as the host computer
to fully utilize the CAD and established the
groundwork for data processing technology. As
the development of ICs progressed from 16-Kbit
DRAM to 64-Kbit DRAM, the patterning
accuracy had to be improved. We started
investigating the installation of an electron beam
(EB) exposure system in 1978 and began reticle
patterning using our EB system in 1981.
Since then, we have generated patterns for VLSIs
while continually upgrading the performance of
our system to this day.

We first manufactured the master mask by
printing and laying out a 30× pattern on a
1 200-mm-square film. We installed a Photo Repeater®
in 1967 to manufacture the mask for a two-inch
wafer. We also installed a Photo Repeater® for
the Cr hard plate, the first in Japan, in 1968.
The only device manufacturers that have achieved
in-house development of mask materials such as
Cr hard plates are IBM, TI, and Fujitsu. During
this period, the emulsion (Em) master mask was
continuously mass produced. Around 1978, we
changed the master mask entirely to the hard
(Hd) master mask. In 1983, we introduced
the technology for fabricating master masks
by direct EB lithography into the production line.

We first fabricated the working mask by
printing the above-mentioned 30× film master
on Em blanks using a 1/30 reduction camera.
In 1965, we developed contact printing
technology to fabricate a working mask from the
master mask, and put it to practical use.
We first fabricated this working mask as
a 35-mm-square mask corresponding to a 1-inch
wafer.

Then, we began to fabricate large masks
for large-diameter wafers (see Fig. 1). In 1974,
we developed two innovative and original
technologies to reduce the cost and improve
the quality of the working mask: a mask
surface finishing technology that prevents
over-contact, and a soft contact printer. In
addition, we investigated synthetic quartz
materials, which will be described later. We
currently provide a 175-mm diameter mask.
The mask factory is located adjacent to the
wafer factory because it supplies the wafer
process with masks.

One of the major results of mask technology
applications was the practical use of the direct
wafer pattern exposure technology we
developed, based on the master mask production
technology. This stepper exposure technology
was first used in 1977. It could
be realized only after being integrated with
the reticle manufacturing technology. Fujitsu
was the first company to put these technologies
to practical use.

Since 1969, we have placed great emphasis
on the development and application of CAD
for design patterns. Today, with the
development of mass-produced VLSIs
and gate arrays with quick turnaround time
(QTAT), the Data Processing Department
supports the Design Department using mask
CAD technology. Submicron patterns
are also becoming increasingly widespread.
Mask technology is the bridging technology
that accurately forms the device patterns
required by the Design Department for wafer
processing. Mask technology therefore plays an
important role by supporting the Design Department
in pattern formation and by providing the
Process Department with reticles and masks.

Fig. 1—Mask size comparison: a) 1st generation,
35-mm-square mask; b) 4th generation (present),
175-mm-diameter mask.


3. Development and expansion of mask element technology

The chronological table (see Fig. 2) lists the
development and expansion of mask element
technology. Figure 3 shows the transition
of the manufacturing process roughly classified
by generation. This section provides
supplementary background information on the
development of the manufacturing processes.


3.1 Early technology — working mask technology 

Fujitsu’s early efforts aimed to achieve
a mask overlay accuracy of 1 µm for the first
generation (1963 to 1966). At that time, such
a high mask overlay accuracy was unheard of.
To maintain this accuracy, it was necessary to
minimize the layout error. Therefore, we
planned to manufacture the mask by reducing
it in a single step at high magnification. We
designed and manufactured a high-magnification
reduction camera, a 1/30 vertical type.
For the layout, we developed a specially
designed printing machine by combining our
numerical control machine (FANUC Prototype
Model-120) with a photoengraving machine
available on the market. As a result, we were
able to provide a 35-mm square mask. In 1965,
we started development of the following
technologies.

One technical development was to realize
a working mask of metal on glass (Cr mask).
This led to the current mask technology. At
that time, there were two major problems
with the working mask: the life span was
short, and high-contrast patterns having sharp
shapes could not be formed. The life span
of the working mask was short because it
was easily scratched when it was used in the
wafer process. High-contrast patterns having
sharp shapes were difficult to form because,
due to the low sensitivity of the photoresist,
there was little latitude in the exposure, and
interference fringes were generated on the mask
by excess light passing through when the mask
was exposed to intense light.

Another technical development was the
mask contact print system. This system provided
the foundation for mass producing high-quality
and low-cost working masks in the third
generation, as described later.


3.2 Establishment of basic mask-making technology

When the development of ICs (CSL) for
electronic switching systems began, we
developed mask-making technology for the
CSL. The key feature in this second generation
was the step-and-repeat camera (i.e. a Photo
Repeater®) installed in 1967. Later, installation
of a Photo Repeater® having several
modifications in the performance helped
establish the mask technology. (The second
generation started at this time.) Installation of
the Pattern Generator® in 1971 for IC
development of the MOS 1-Kbit DRAM and
ECL 100-gate array determined the direction of
the reticle-making technology.

Although supplementary, we also
independently developed an important repair
technique, required when the mask
material was changed to hard Cr. This technique
eliminates pattern defects caused by contamination
and by defects in treated resist materials
to ensure a 100-percent defect-free reticle.
The repair technique was later improved from
a laser system to the focused ion beam system,
which is a current basic technique.

Fig. 2—Trend in mask engineering technology.

Fig. 3—Trend of mask making technology flow.


3.3 Mask-making technology in growth period

This third generation (1974 to 1983) occurred
when ICs became LSIs, that is, when
MOS devices progressed from 4-Kbit to 16-Kbit
DRAMs and bipolar devices started being
developed with ECL 400 gates. This period was
a major turning point in the quality and
accuracy of masks. To achieve fine patterns,
to reduce the defect density, and to improve
the total MTF, the soda-lime glass (blue and white
blanks) used for hard (Cr) and soft
(Em) glass blanks was changed to synthetic
quartz glass, which is chemically and physically
stable.

At that time, the cost of the synthetic
quartz glass was about 25 times that of soda-lime
glass. However, the features of synthetic quartz
glass, viz., optical transmittance, abrasion
resistance, resistance to chemicals, and low
thermal expansion, solved the problems
associated with thermal expansion that occurred
in the wafer process when soda-lime glass was
used, and solved the process instability.
Furthermore, we obtained other improvements, such
as cost reduction: because the synthetic quartz
could be efficiently reused, its cost ended up
less than that of soda-lime glass.

We introduced automatic surface inspection
technology when the blanks
pinhole (Ph) inspection machine was developed.
Surface inspection then changed from
sensory inspection to machine inspection.
We also developed and installed a clean room
to improve accuracy and quality. This technology,
which maintains local cleanliness and
keeps the ambient temperature to within
0.1 °C, played an important role in stabilizing
the alignment accuracy in the mask-making and
wafer processes.

At the same time, we also developed laser
interferometric measurement technology to
control the accuracy, and installed an
interferometer in 1972. The interferometer is
required to thoroughly control the accuracy of
both products and machines when mass
producing masks. Higher integration of LSIs,
conversion of mass-produced MOS devices from
256-Kbit DRAM to 1-Mbit DRAM, development
of the 4-Mbit and 16-Mbit DRAM VLSI, and
large-scale gate array products with ultrashort
delivery times have resulted in another major
turning point for mask-making technology. The
next section describes the technology of the
fourth generation.


4. Current mask technology 

Incorporating the above changes, the Mask
Engineering Department supplies the Wafer
Processing Department with reticles and masks.
Figure 4 shows the current functional structure
of the Mask Department. The Kawasaki Works
is the main site of the Mask Department
and is mainly responsible for technical
development and making prototypes. It also
fabricates the reticles and partially fabricates
the masks for mass-produced LSIs including
advanced and prototype LSIs.

As described above, a mask shop is installed
at each wafer process factory and produces
and supplies the reticles and masks for the
LSIs fabricated at the works. This is an
important advantage, allowing us to provide
customers with devices in a QTAT that other
companies cannot match. At present, these shops
develop the reticle and mask technologies and
fabricate the reticles and masks. The process
flow is outlined below.

Fig. 4—Function structure of mask department.


4.1 Data processing 

To fabricate LSIs according to the CAD 
data (design data) generated by the LSI designer, 
the first process required is data processing. 
Data processing converts the design data into 
manufacturing data for the EB exposure 
machine (exposure data) and inspection machine 
(inspection data). These processes depend 
on conversion technology using super computers 
and on verification technology that verifies 
the converted data. 

Figure 5 shows the flow of data processing.
First, the design data is transferred domestically
and internationally to supercomputers through
communication lines. Then, data specific to the
LSI manufacturing machines (process data)
is created, such as the alignment pattern
data used in the stepper. We convert both types
of data into exposure data using the conversion
software we developed. The converted data
is then verified and transferred to the most
suitable factory to fabricate the reticles and
masks.

Fig. 5—Data processing flow.


The various computers in each factory
form a network using the Flexible System
Link (FSL). FSL enables inter-system
communication with high-speed transmission of
up to 33 Mbit/s and channel connection through
high-speed and highly reliable communication
lines consisting of optical fiber cables. Our
network is effective for LSI fabrication in
each remote factory. This advanced network
between the design center and mask shop
has been proposed and is being implemented
in the USA. We have already been able to
reduce the delivery time of ASICs using this
network.

4.1.1 Data conversion technology 

In this subsection, we describe the data 
conversion technology that forms the nucleus 
of data processing. Data conversion technology 
enables the conversion process that generates 
exposure data from design data and enables 
the pattern generation process that generates 
process pattern data for the equipment used 
for wafer fabrication (e.g. stepper). As the 
complexity of wafer fabrication increases, 
process pattern data is also becoming more 
complex. Process pattern data and design 
data become more complex as the required 
device pattern becomes increasingly complex.

Table 1. Data conversion processing results

Conversion software                     CPU time ratio     Output data ratio
                                        (FACOM M-780)
Non-hierarchical structured software    23.4               20.1
Hierarchical structured software         1.0                1.0

These are the results for a memory device (about 50 million
patterns).

To resolve these problems, we have
developed an original data conversion system
called Fujitsu Artwork Data (FAD) processing
on a FACOM M-780 mainframe computer.
Although the FAD system enables conversion
of large amounts of data, the dramatic increase
in the volume of design data in recent years
requires a more efficient system. Therefore,
we have introduced a hierarchical structure
into the exposure data to manage it
more easily, and developed a new FAD system that
reduces both the conversion time and the
volume of exposure data.

Table 1 lists the results of the conversion
process for the data of one 24-Mbit DRAM
mask. This mask data includes about fifty
million patterns (5 × 10⁷). Table 1 shows
that the productivity of the new FAD system
is 23 times higher than that of the conventional
system, and that the new system reduces the volume
of exposure data to about 5 percent of that of
the conventional system. The new FAD system has
had a significant impact on high-density memory
devices, in which the same pattern is repeated
frequently, and has contributed to the
development and fabrication of these devices.
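
Why a hierarchical structure pays off most for memory devices can be illustrated with a small sketch. This is not the FAD data format, whose details are not given here: the `Rect` and `ArrayRef` records, their fields, and the example cell and array sizes are all hypothetical. The idea shown is only that a repeated cell is stored once, together with its repetition counts and pitches, instead of as one record per figure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    """One rectangular figure; units are arbitrary grid steps (hypothetical)."""
    x: int
    y: int
    w: int
    h: int


@dataclass(frozen=True)
class ArrayRef:
    """A cell repeated nx by ny times at a fixed pitch (hypothetical record)."""
    cell: Tuple[Rect, ...]
    nx: int
    ny: int
    pitch_x: int
    pitch_y: int


def flatten(ref: ArrayRef) -> List[Rect]:
    """Expand the array into the flat list a non-hierarchical format stores."""
    return [
        Rect(r.x + i * ref.pitch_x, r.y + j * ref.pitch_y, r.w, r.h)
        for j in range(ref.ny)
        for i in range(ref.nx)
        for r in ref.cell
    ]


def flat_record_count(ref: ArrayRef) -> int:
    """Records needed when every figure is written out individually."""
    return len(ref.cell) * ref.nx * ref.ny


def hierarchical_record_count(ref: ArrayRef) -> int:
    """Records needed when the cell is stored once plus one array record."""
    return len(ref.cell) + 1


# Illustrative memory-like block: a 4-figure cell repeated 256 x 256 times.
cell = (Rect(0, 0, 2, 6), Rect(4, 0, 2, 6), Rect(0, 8, 6, 2), Rect(0, 12, 6, 2))
block = ArrayRef(cell=cell, nx=256, ny=256, pitch_x=10, pitch_y=16)

print("flat records:        ", flat_record_count(block))
print("hierarchical records:", hierarchical_record_count(block))
```

The reduction in this toy case is far larger than the factor of about 20 in Table 1, which is to be expected: a real mask also contains peripheral circuits and other non-repeating figures that cannot be folded into an array.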

Data compaction technology is essential
for the next generation of high-density devices.
Compaction techniques reduce not only the
load on the data conversion process but also
the volume of data transferred between works.
Currently, the new FAD system enables the handling
of large amounts and various types of data
from both domestic and overseas customers
via a communication network and satellite
system.

4.1.2 Data verification technology

Data verification is another very important
aspect of data processing. It is necessary to
compare design data with exposure data and
to check the exposure data based on the design
rules. Verification processes include the
comparison of reticle to design data, CD
measurement, and checking compliance with
design rules using both software and
dedicated hardware. As the feature size of devices
becomes smaller, a verification accuracy of
0.01 µm or less will be required for the next
generation of devices. This new technology
is currently being developed.
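
The two software checks named above, comparison of exposure data with design data and a design-rule check, can be sketched as follows. This is only an illustration under stated assumptions, not the verification software described in this paper: figures are assumed to be axis-aligned rectangles `(x, y, width, height)` on a 10-nm grid (matching the 0.01 µm accuracy mentioned above), the only rule checked is a minimum width, and the function names are hypothetical.

```python
from typing import List, Tuple

# (x, y, width, height) in units of 10 nm; this representation is an assumption.
Rect = Tuple[int, int, int, int]


def width_violations(rects: List[Rect], min_width: int) -> List[Rect]:
    """Return rectangles whose narrower side is below the minimum width."""
    return [r for r in rects if min(r[2], r[3]) < min_width]


def data_mismatch(design: List[Rect], exposure: List[Rect]):
    """Return (figures missing from exposure data, figures not in design data)."""
    d, e = set(design), set(exposure)
    return sorted(d - e), sorted(e - d)


# Illustrative data: the second figure was altered by 10 nm during conversion.
design = [(0, 0, 50, 200), (100, 0, 40, 200)]
exposure = [(0, 0, 50, 200), (100, 0, 41, 200)]

missing, extra = data_mismatch(design, exposure)
print("missing from exposure data:", missing)
print("not present in design data:", extra)
print("narrower than 0.5 um:", width_violations(design, min_width=50))
```

Working on an integer grid keeps the comparison exact, so a 0.01 µm discrepancy shows up as a mismatch rather than being lost in floating-point rounding.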


4.2 EB exposure system 

With the advances of optical exposure
technology such as excimer steppers in the
wafer process, it will soon be possible to
pattern the 16-Mbit DRAM VLSI using the stepper.
However, design rules of 0.5 µm or
finer will require highly accurate reticles for
stable production of LSIs. For 5× reticles,
each layer must maintain an overall dimensional
error and a positioning accuracy
of 0.1 µm or less and an inter-layer overlay
accuracy of 0.1 µm or less.
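
To put these reticle tolerances in wafer terms, the following short calculation scales them by the reduction ratio. It assumes an ideal 5:1 reduction with no additional error from the stepper optics or the process, which is a simplification; the function name is illustrative only.

```python
def wafer_error_um(reticle_error_um: float, reduction: float = 5.0) -> float:
    """Error on the wafer for a given reticle error, assuming ideal reduction."""
    return reticle_error_um / reduction


reticle_tolerance_um = 0.1   # from the text: 0.1 um or less on the 5x reticle
design_rule_um = 0.5         # from the text: 0.5 um design rule

on_wafer = wafer_error_um(reticle_tolerance_um)
share = on_wafer / design_rule_um

print(f"reticle error on wafer: {on_wafer:.3f} um")
print(f"share of design rule:   {share:.0%}")
```

Under this assumption a 0.1 µm reticle error corresponds to 0.02 µm on the wafer, a few percent of a 0.5 µm feature.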

This section describes the highly accurate
reticle patterning EB exposure system, named
NOWEL, which has been developed to meet
these requirements. Table 2 lists the system
specifications and Fig. 6 is a block diagram
of the system.

NOWEL has the following three major 
characteristics: 

1) Vertical landing deflection system, 
2) double-exposure method A/B mode, and 
3) repeated data compression. 

The following two subsections describe 
its characteristics in detail. 

4.2.1 Vertical landing deflection system 

The NOWEL deflection system consists
of a two-stage deflector, that is, an
electromagnetic deflector that deflects the
5-mm-square main field and an electrostatic
octupole deflector for deflecting a field of
100 µm. The electromagnetic deflector is composed
of three saddle
coils in series for both the X-axis and the Y-axis.




Table 2. Specifications of NOWEL system

Item               Specification
Writing method     Step-and-repeat moving stage
                   Variable-shaped beam (max 4 µm square)
                   Dual deflection system
                   Vector scan
Beam voltage       20 kV
Current density    20 A/cm², LaB6 gun
Clock rate         Max 5 MHz
Substrate          Max 7 in; automatic loading, 5 in square
CD control         <0.1 µm (3σ)
Butting accuracy   <0.1 µm (main field, 5 mm square)
Overlay accuracy   <0.1 µm (3σ)
Edge roughness     0.05 µm


Fig. 6—Block diagram of NOWEL system.


Vertical landing is achieved by combining
a short-working-distance lens (M=0.86) and
the electromagnetic deflector. Simulation of
the deflection system is based on third-order
aberration theory. For a 5-mm-square deflection
area, vertical landing, and low deflection
aberration, we obtained five third-order aberration
coefficients and a primary chromatic aberration
coefficient. Table 3 lists the numerically
calculated aberrations that cannot be dynamically
corrected, together with the preset conditions.

Figure 7 shows the differential backscattered 
electron signal at the center and at the upper 
right corner of the main field. These correspond 
to a 1-um wide beam. The X component of 
the aberration is largest at the upper right 
corner. Expansion of the beam due to this 
aberration is about 0.2 um. This value is worse 
than the results of numerical calculation and is 
assumed to be due to an error during coil 
manufacturing and assembly. 

Fig. 7. Differentiated backscattered electron signals from calibration grid: a) the center of main field, b) the upper right-hand corner of main field. 

Table 3. Conditions of deflection system 

Item                                     Specification 
Setting          Beam voltage            20 keV 
                 Energy distribution     2 eV 
                 Beam half angle         7 mrad 
Aberration       Coma                    0.027 um 
                 Spherical               0.097 um 
                 Transverse chromatic    0.100 um 
                 Axial chromatic         0.070 um 
                 Root mean square        0.158 um 
Telecentricity   Beam landing angle      0.4 mrad 
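
The "root mean square" entry in Table 3 can be checked with a short calculation. The sketch below assumes that the entry is the quadrature (root-sum-square) combination of the four listed aberrations; the paper does not state the formula, so this is an assumption, and the function name is illustrative only.

```python
import math

# Aberration contributions from Table 3, in micrometres.
ABERRATIONS_UM = {
    "coma": 0.027,
    "spherical": 0.097,
    "transverse chromatic": 0.100,
    "axial chromatic": 0.070,
}


def root_sum_square(values):
    """Combine independent blur contributions in quadrature (assumed model)."""
    return math.sqrt(sum(v * v for v in values))


calculated_um = root_sum_square(ABERRATIONS_UM.values())
measured_um = 0.2  # beam expansion reported at the upper right corner

print(f"calculated blur: {calculated_um:.3f} um")
print(f"measured / calculated: {measured_um / calculated_um:.2f}")
```

Under this assumption the four contributions reproduce the tabulated 0.158 um, and the measured 0.2 um is roughly a quarter larger than the calculated value.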

Figure 8 shows the change of the field 
butting error at different substrate heights. 
The difference in the reading of the vernier 
between the adjustment reference level and 
the level 80 um below the adjustment reference 
level is 0.4 um. That is, the beam landing angle 
at the corner of the main field is about 
2.5 mrad. Since the variation in the height 
of the actual substrate is about 10 um, the 
effect is only 0.05 um. 

Fig. 8. Main field butting error vs substrate height variation, z position. 

Fig. 9. Basic concept of the A- or B-modes: a) A mode, b) B mode, c) A mode plus B mode. (Figure labels: half dosage, one rectangular pattern.) 
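
The landing-angle figures quoted with Fig. 8 can be reproduced with simple geometry. The sketch below assumes that a butting error is the sum of equal and opposite shifts in two adjoining main fields, so each field contributes half of the vernier change. That factor of two is an interpretation made here to reconcile 0.4 um over 80 um with 2.5 mrad; it is not stated in the paper.

```python
FIELDS_PER_BUTT = 2  # assumption: two adjoining fields contribute equally


def landing_angle_mrad(butting_change_um, height_offset_um):
    """Landing angle implied by a butting change over a height offset."""
    per_field_shift_um = butting_change_um / FIELDS_PER_BUTT
    return per_field_shift_um / height_offset_um * 1000.0


def butting_error_um(angle_mrad, height_variation_um):
    """Butting error caused by a given substrate height variation."""
    return FIELDS_PER_BUTT * (angle_mrad / 1000.0) * height_variation_um


angle = landing_angle_mrad(butting_change_um=0.4, height_offset_um=80.0)
error = butting_error_um(angle, height_variation_um=10.0)

print(f"landing angle at field corner: {angle:.2f} mrad")
print(f"butting error for 10 um height variation: {error:.3f} um")
```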


4.2.2 Double-exposure method "A/B mode" 

Figure 9 shows the basic concept of the A/B 
mode, using a rectangular pattern as an example. 
The pattern is exposed twice in succession and 
receives half the dosage in each exposure. 
The numbers in a) and b) of Fig. 9 
indicate the sequence of the exposure. The 
initial exposure divides the pattern equally 
into m x n shots. This is called the A mode. 

The B mode shifts the pattern by half the 
shot size in the A mode and exposes it. The 
peripheral section has a shot size half that 
for the A mode for the X-axis, the Y-axis, 
or both axes. As a result, the shot boundaries 
of the A and B exposures fall at different 
positions, and the edges of the A and B shots 
cancel each other out. This series 
of operations is automatically performed by 
the logic circuit of the pattern generation 
unit. Figure 10 shows the resist pattern exposed 
in each mode. The pattern edge roughness 
was improved to less than 0.05 um using the 
A/B mode. 
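
The shot division described above can be sketched in a few lines. This is only an illustration of the geometry, not the logic of the NOWEL pattern generation unit; the function names and the 6 x 4 rectangle are assumptions chosen for the example.

```python
def a_mode_edges(length, n):
    """Shot boundaries when a side is divided into n equal shots (A mode)."""
    size = length / n
    return [i * size for i in range(n + 1)]


def b_mode_edges(length, n):
    """Same shot size shifted by half a shot; the two end shots are half size."""
    size = length / n
    return [0.0] + [(i + 0.5) * size for i in range(n)] + [float(length)]


def shots(x_edges, y_edges):
    """Rectangular shots (x0, y0, x1, y1) from boundary lists on each axis."""
    return [(x0, y0, x1, y1)
            for x0, x1 in zip(x_edges, x_edges[1:])
            for y0, y1 in zip(y_edges, y_edges[1:])]


def area(shot_list):
    return sum((x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in shot_list)


def interior(edges):
    """Boundaries that lie inside the pattern (shot-to-shot joints)."""
    return set(edges[1:-1])


width, height, m, n = 6.0, 4.0, 3, 2
a_shots = shots(a_mode_edges(width, m), a_mode_edges(height, n))
b_shots = shots(b_mode_edges(width, m), b_mode_edges(height, n))

# Joints of the two exposures never coincide, so a joint error in one
# exposure is overlaid by the uninterrupted interior of a shot in the other.
shared_x = interior(a_mode_edges(width, m)) & interior(b_mode_edges(width, m))
shared_y = interior(a_mode_edges(height, n)) & interior(b_mode_edges(height, n))

print(len(a_shots), len(b_shots), shared_x, shared_y)
```

Each exposure covers the whole rectangle once at half dosage, so every point receives the full dosage while no joint is written twice at the same position.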

4.2.3 Repetitive data compression 

Figure 11 shows how the NOWEL hierarchical 
data is stored in the memory. The memory 
cell area consists of memory cell blocks 
composed of small cell units. Therefore, for 
repetitive data such as a memory cell, only the 
cell unit pattern data and the repetition 
information need to be specified for the LSI. 

The repetition information includes the 
coordinates of the starting point, the array 
pitch of the repetition, and the number of 
arrays. Repetition is specified at two 
hierarchical levels, one for the cell unit 
within a block and the other for the block 
within the LSI. For the pattern data, only 
the basic cell and peripheral circuit are required. 
The subfield matches the cell unit size. 

The main deflector develops the unit according 
to the repetition information in the block. 
This is called the Main Deflection Matrix Mode. 
We developed a 20-bit DAC for accurate 
positioning, especially when specifying the 
position in this mode (0.005-um LSB for a 
5-mm deflection). The block in the cell area is 
configured by moving the stage. 

With this data compression technology, the 
amount of cell data becomes negligible. Since only 
the peripheral circuit needs to be considered, 
the total amount of data for a 16-Mbit DRAM 
was reduced to about one tenth of its earlier 
value. 

Fig. 10. Scanning electron microscope (SEM) micrographs: a) A mode, b) B mode, c) A mode plus B mode. 

Fig. 11. Hierarchical structure of pattern data. 
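
The two-level repetition scheme and the DAC resolution can be illustrated as follows. This is a minimal sketch under stated assumptions: the `Repetition` record, the `expand` helper, and the pitches and counts are hypothetical and do not represent the actual NOWEL data format. Only the 20-bit DAC and the 5-mm deflection come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repetition:
    """Hypothetical repetition record: start point, array pitch, array count."""
    origin: tuple  # (x, y) of the first instance, um
    pitch: tuple   # (px, py), um
    count: tuple   # (nx, ny)

    def positions(self):
        ox, oy = self.origin
        px, py = self.pitch
        nx, ny = self.count
        return [(ox + i * px, oy + j * py)
                for j in range(ny) for i in range(nx)]


def expand(block_rep, cell_rep):
    """Flat list of cell-unit positions from the two hierarchy levels."""
    return [(bx + cx, by + cy)
            for bx, by in block_rep.positions()
            for cx, cy in cell_rep.positions()]


def dac_lsb_um(field_um, bits):
    """Smallest position step of a DAC spanning the deflection field."""
    return field_um / (2 ** bits)


blocks = Repetition(origin=(0.0, 0.0), pitch=(100.0, 100.0), count=(2, 2))
cells = Repetition(origin=(0.0, 0.0), pitch=(2.0, 3.0), count=(4, 4))

flat = expand(blocks, cells)
print(f"{len(flat)} cell placements described by 2 repetition records")
print(f"LSB for 5-mm field, 20 bits: {dac_lsb_um(5000.0, 20):.4f} um")
```

Two records plus one cell pattern replace the full list of placements, which is why the cell data volume stops mattering. The LSB works out to 5000 um / 2^20, about 0.005 um, as quoted.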

4.2.4 Overlay accuracy 

Figure 12 shows the overlay accuracy for 
three test reticles, including the repeatability 
of the inter-reticle position. We measured 
the positions of 11 x 11 cross patterns 
arranged at a 10.6-mm pitch using a Nikon 2I 
and observed how the two reticles overlapped. 
The error was less than 0.1 um. 

Of the three features of NOWEL, the A/B 
mode and repetitive data compression are 
completely effective. However, the vertical 
landing deflection system has a 0.2-um 
aberration in the 5-mm field, which is 
worse than the target value. Therefore, in actual 
operation, we suppress the aberration within a 
single reticle and the field butting error by 
using the field in an area smaller than 
2 mm square. We can obtain a dimensional 
accuracy of 0.07 um (3σ). 

Fig. 12. Plot of overlay accuracy derived from three patterns. 


4.3 Process technology 
This section describes the automation 
of reticle processing and dry etching. 

4.3.1 Automation technology 

The reticle processor currently being used 
is an in-line processor that automatically 
performs the processes from resist developing 
to chrome film etching. 

This processor has the following two 
features. 

1) Mixed simultaneous processing of positive 
and negative blanks by automatic recogni- 
tion 

2) Mechanical transfer system to minimize 
contamination caused by the carriers 

The first feature, mixed processing of 
positive and negative blanks, enables both 
positive and negative resists to be used so that 
the processing capability of the EB exposure 
system (NOWEL) can be fully utilized. Proper 
use of the resists minimizes the exposure area 
required to form the pattern and improves 
the throughput. This processor has lines that 
can handle both types of resists and performs 
this processing by providing MPUs that control 
each line's equipment and a CPU that controls 
the overall system. 

The second feature, the mechanical transfer 
system, achieves cleanliness during the process. 
This is important because even one particle of 
dust deposited on the reticle during processing 
will make it defective. Contamination generated 
by mechanical transfer systems had been an 
unsolved problem, but we eliminated it by 
completely separating the transfer mechanism 
section from the blank carrier area and by 
installing the unit in a laminar down-flow thermal 
clean booth. By docking this machine to the 
exposure machine transfer line, the processes 
from exposure to chrome etching can be 
performed completely automatically. This has 
enabled defect-free reticles to be manufactured 
at high yield. 

4.3.2 Dry etching technology 

To cope with fine patterns, we began 
development of dry etching technology in 1979. 
At present, we process all reticles for VLSI 
using dry etching. Dry etching of the reticle 
presents two technical problems. One is that 
the selectivity between the resist and Cr 
is low because the Cr etching rate is low 
(and conversely, the etching rate of the EB 
resist is high). The other problem is that the 
reactor becomes large, which makes it difficult 
to obtain uniformity of the plasma because 
the blanks are square. Because the selectivity 
depends on the pressure in the reactor and the 
RF power supplied to the plasma, we improved 
the selectivity by increasing the pressure to the 
point where a practical throughput could be 
obtained. We also changed the 
discharge system to a cathode-coupled 
parallel-plate electrode and reduced the RF 
power. To improve the plasma uniformity, 
the inside of the reactor and the outer vacuum 
tube system were constructed symmetrically. 
This is because the uniformity of the RF 
discharge plasma depends on the symmetry of 
EM wave propagation within the machine. 
By applying the above technologies, we 
introduced a dry etching process in which CD 
control is simple and the sharpness of the 
pattern edge is superior for reticle processing. 

Table 4. Inspection specifications of high grade 5 x reticle 

Inspection item         Specification 
Defect size             <1.0 um 
CD accuracy             ±0.1 um 
Runout orthogonality    3 ppm 


4.4 Inspection technology 

The inspection checks that the reticles are 
manufactured accurately to the shape, 
dimensions, and position specified by the design 
information. 

The current reticle inspection is explained 
below. 

1) Defect inspection 

Detects any pattern abnormality on the 
reticle, using a die-to-die inspection or data 
base inspection machine. 

2) Data base inspection3) 

Compares the inspection data with the reticle 
pattern, and detects defects common to all chips 
(defects caused by exposure data). 

3) CD inspection 

Measures the specified dimensions of the 
patterns. Normally, it checks the minimum 
line width. 

4) Metrology inspection 

Uses a laser interferometric measuring 
machine to measure the length of the reticle 
pattern from the absolute coordinate and checks 
for orthogonality errors. 

5) Final inspection 

Inspects the front and rear sides of the 
reticle for dust and scratches. 

6) Defect repair 

Repairs all defects detected by the defect 
inspection. 

The inspection specification is determined 
by the reticle magnification (5 x and 10 x) 
and the required accuracy of the device type. 
Table 4 shows the 5 x reticle inspection 
specification for an advanced LSI. Especially for 
patterns 1 um or less on LSI devices (5 um on 
5 x reticles), even a pattern edge defect only 
about ten percent the size of the pattern width 
on the reticle can become a catastrophic defect 
because the pattern is close to the maximum 
resolution of the reduction stepper in the wafer 
process. Therefore, for reticles having fine 
patterns such as for a 4-Mbit DRAM, it is 
necessary to detect defects of 0.5 um or less 
on the reticle. We can inspect the reticles more 
accurately and efficiently by laying out a 
dedicated inspection measurement pattern in 
the reticle independent of the device pattern. 
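
The 0.5-um detection requirement follows directly from the reduction ratio and the ten-percent rule quoted above. The sketch below makes that arithmetic explicit; treating "about ten percent" as exactly 0.10 is an assumption, and the function name is illustrative.

```python
def reticle_defect_limit_um(wafer_feature_um, magnification, defect_fraction):
    """Largest tolerable edge defect on the reticle for a given wafer feature."""
    reticle_feature_um = wafer_feature_um * magnification
    return reticle_feature_um * defect_fraction


limit_5x = reticle_defect_limit_um(wafer_feature_um=1.0,
                                   magnification=5,
                                   defect_fraction=0.10)
print(f"defect size to detect on a 5 x reticle: {limit_5x:.2f} um")
```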

It is also necessary to improve the data base 
inspection in the future. This is because the 
reticle die-to-die inspection can detect the 
defects by comparing both patterns, but it 
cannot detect the abnormalities that repeatedly 
occur in the reticle pattern. However, data 
base inspection can detect these abnormalities 
and ensure that the pattern matches the CAD 
data of the reticle. 
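
The difference between the two inspection methods can be shown with a toy bitmap comparison. This is only a conceptual sketch using tiny hand-made bitmaps; real inspection machines compare optical images, and all names below are illustrative.

```python
def differing_pixels(a, b):
    """Coordinates at which two equally sized bitmaps differ."""
    return {(r, c)
            for r, (row_a, row_b) in enumerate(zip(a, b))
            for c, (pa, pb) in enumerate(zip(row_a, row_b))
            if pa != pb}


def flip(bitmap, r, c):
    """Return a copy of the bitmap with one pixel inverted (a defect)."""
    copy = [row[:] for row in bitmap]
    copy[r][c] ^= 1
    return copy


design = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# Random defect: present in one die only.
die_a, die_b = design, flip(design, 0, 0)
random_die_to_die = differing_pixels(die_a, die_b)

# Repeated defect: the same error in every die, e.g. from the exposure data.
rep_a, rep_b = flip(design, 2, 1), flip(design, 2, 1)
repeated_die_to_die = differing_pixels(rep_a, rep_b)
repeated_database = differing_pixels(rep_a, design)

print(random_die_to_die, repeated_die_to_die, repeated_database)
```

The die-to-die comparison finds the random defect but reports nothing for the repeated one, because both dies are wrong in the same way. Only the comparison against the design data exposes it.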

The inspection items in the data base inspec- 
tion are as follows: 

1) Inspection of defects that occur at random 
in a single chip reticle 

2) Inspection of exposure data, pattern 
deformation and whether the pattern has 
fallen off 

3) Inspection of abnormalities that occur in 
the exposure system and the present condi- 
tions of the positioning coordinates. 

We created the data for the data base 
inspection machine from the design data in 
parallel with the exposure machine data. To 
process the large amount of data for devices such 
as a 4-Mbit DRAM, the data compression system is 
applied to memory devices with repetitive 
patterns and has been put to practical use. We 
use different data processing programs to create 
the inspection data and the exposure data. As a 
result, by comparing against the inspection data 
we can detect both preset errors that occur when 
creating the exposure data and program errors. 
However, the data base inspection also compares 
the image from the actual reticle pattern with 
the inspection data, which results in problems 
associated with the deterioration of accuracy, 
such as judging the acceptability of the pattern 
shapes and positioning errors caused by both 
the exposure and inspection systems. Thus, 
there is still room for improvement in the 
inspection accuracy and we are continuing 
to study the inspection methods for 
reticle-making technology. 


5. Future of mask technology 

Optical exposure will continue to be the 
lithography technology used in the wafer 
process, and it must meet the diversification of 
ICs, fine patterns for higher integration, and 
the requirements for high accuracy. To resolve 
these issues, we are continuing to develop the 
technology. The increase in the amount of data 
due to high integration lengthens both the data 
processing time and the patterning time of the 
exposure machine, and the problem will not be 
fully solved if these are handled by separate 
methods. This means that we 
cannot solve this future problem only through 
the development of individual technologies. 
It is necessary to develop combinations of 
technologies. In view of these considerations, 
our plan for the future is as follows. 

We are currently developing the blanks 
required by 64-Mbit DRAMs as follows. We have 
almost completed a method for manufacturing 
non-fluorescent glass that is resistant to the 
KrF excimer laser. The blanks require the 
development of an angular resist coating 
technology to coat the entire surface with an 
accuracy of 5 mm to achieve a sensitivity 
dispersion of ±0.01 um, and the development of 
a Cr metalizing technology having a pattern 
fracture ratio due to cleaning of less than 
1 ppb. We are developing an ultra-clean 
automatic process for the manufacture of 
pinhole-free blanks. For glass cleaning, we have 
already partially implemented an automatic 
cleaning technology for particles 0.1 um or less. 
We are also developing in-line sputtering 
processes and are obtaining good results. 

In our view, data processing is directly 
related to QTAT. We will therefore attempt to 
complete the processing in a shorter period of 
time by incorporating special processing 
software that exclusively processes the 
characteristic patterns of each device type 
into the conventional general data processing 
software. We are also developing technology 
to change the conventional two-dimensional 
verification to three-dimensional verification by 
simulating the pattern formed on the actual 
wafer from the pattern on the reticle or from 
the design data. 

The exposure technology requires an 
exposure machine having a much higher resolution 
and accuracy, which will result in an increase 
in the exposure data. For the exposure machine, 
we are now developing the next generation 
NOWEL that can manufacture reticles for up 
to 64-Mbit DRAMs. As described above, we 
are already compressing the exposure data 
by making the best use of the pattern features 
of each device type, and we will further expand 
the range of application in the future. 

The reticle process department will mainly 
develop the process, with emphasis on CD 
controllability for future miniaturization 
of the dimensions. Not all patterns that form 
the device are submicron patterns. Some reticles 
have patterns as large as 100 um or more, or mix 
large patterns with submicron patterns. It is 
difficult to manufacture these reticles within 
the required dimensional accuracy for both, 
although proximity effect correction at exposure 
can resolve this problem. The multilayer resist 
process can also resolve the problem. Therefore, 
we are setting our development goal at a 
dimensional control accuracy of 0.01 um. The 
yield from reticle manufacturing is an important 
factor that affects the ASIC TAT. We believe that 
the high yield resulting from a stabilized process 
will contribute to the realization of QTAT. 

The inspection technology also requires 
high accuracy and high-speed performance. 
To improve the accuracy, we are studying 
improvements in the current optical inspection 
and an inspection machine that uses an electron 
beam4). However, if the accuracy is simply raised, 
the inspection speed will be reduced at the 
current technical level. Therefore, we must 
also develop a system that automatically changes 
the inspection mode according to the contents 
of the reticle pattern. It will perform 
high-accuracy inspection where high accuracy is 
required and high-speed inspection 
where it is not. To handle large-scale 
data in the data base inspection, we will 
compress the data more efficiently and further 
improve the data compression effect for each 
pattern type (repetitive patterns that are 
frequently used such as memory types, patterns 
in which memory and logic are mixed in a single 
device such as some of the logic types, and 
patterns having logic only). We will also expand 
the system so that it can be applied to patterns 
having little repeatability such as random logic 
types. 

We believe that the purpose of the inspec- 
tion is not only to evaluate the acceptability 
or unacceptability of reticles, but also to help 
promote the development of the technology 
by accurately analyzing the various types of 
inspection information and appropriately feed- 
ing them back into the exposure machine and 
process. For this purpose, we are planning to 
develop a system in which we can consistently 
provide high-quality reticles by incorporating 
the data base into the inspection results, auto- 
matically analyzing the results, and incorporat- 
ing an online system between the inspection and 
exposure systems. 


6. Conclusion 

With the advances in ICs, the mask tech- 
nology department has continued its technologi- 
cal innovations mainly in the fine-pattern 
formation technology. To cope with the 
incorporation of VLSIs in the future, the Mask 
Department will develop large-capacity data 
processing technology and quality assurance 
technology in addition to the fine-pattern 
formation technology, and will continue to 
contribute to the development of the semi- 
conductor industry. 


References 

1) Serikawa, H.: FSL. FUJITSU, (in Japanese), 38, 3, 
pp. 194-199 (1987). 

2) Hamaguchi, S., Kai, J., and Yasuda, H.: High- 
precision reticle making by electron-beam litho- 
graphy. J. Vac. Sci. Technol., B6, 1, pp. 204-208 
(1988). 

3) Awamura, D.: Reticle inspection technology to 
compare the pattern against data. Proc. SPIE, 334, 
pp. 230-237 (1982). 

4) Simpson, R.A., and Davis, D.E.: Detecting 
submicron pattern defects on optical photo masks 
using an enhanced EL-3 electron-beam lithography 
tool. Proc. SPIE, 334, pp. 208-215 (1982). 


FUJITSU Sci. Tech. J., 24, 4, (December 1988) 


K. Yanagida et al.: Overview of Mask Technology 


Kimio Yanagida 


Process Development Division 
FUJITSU LIMITED 

Bachelor of Physics 

Tokyo Science University 1964 
Specializing in Process Development 


Takao Furukawa 
Mask Engineering Dept. 
FUJITSU LIMITED 
Junior College 
of Osaka Pref. University 1962 
Specializing in MASK Engineering 
Development 




Takeo Kikuchi 


Mask Engineering Dept. 

FUJITSU LIMITED 

Bachelor of Mechanical Eng. 

Tokyo Science University 1984 

Specializing in MASK Engineering 
Development 




UDC 621.3.049.76 


Packaging Technology for ASICs 

The use of ASICs is rapidly progressing and expanding. ASIC packaging technologies are also 
enabling higher density, diversification, and customization. This review introduces the 
package line-up and ASIC packages currently offered by Fujitsu, then describes current 
technological problems and future trends. 


1. Introduction 

The demand for ASICs is increasing in a 
wide range of fields1)-3). In response to this 
trend4)-6), ASIC packages are required to 
perform more functions than before. For this 
reason, Fujitsu has been developing new 
technologies such as package structures and new 
materials. 

The main technological problems to be 
solved for ASIC packages are as follows: 

1) High pin count technology 

2) Large chip technology 

3) High-density assembly technology 
4) High-power device technology 

5) Special custom package technology. 

These technological problems are often 
interrelated. Thus, a combination of these new 
technologies will enable ASIC functions to be 
fully utilized and will lead to the development 
of package families that can meet all users' 
requirements and applications. The ASIC field 
is rapidly advancing and is highly competitive. 

This paper introduces the current situation 
and future trends concerning Fujitsu’s ASIC 
packaging technologies. 


2. ASIC packages 

Figure 1 shows the current package line-up. 
Of these groups, the types of packaging mainly 
used for ASICs are listed below with a 
description of their features. 

2.1 Shrink DIP (SH DIP) 

The lead pitch of the standard DIP has been 
reduced from 100 mil to 70 mil to reduce the 
package length. The row spacing has also been 
reduced from 600 mil to 400 mil for the 28-pin 
package and from 900 mil to 750 mil for the 
64-pin package. Shrink DIP is used in 
microcomputers and gate arrays. 
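
The effect of the finer lead pitch on package length can be estimated as below. The sketch assumes that the package length scales with the lead span of one row, (pins per row - 1) x pitch, and ignores the body overhang beyond the end leads, which the text does not give.

```python
MIL_TO_MM = 0.0254


def lead_span_mil(pin_count, pitch_mil):
    """Distance between the first and last lead of one row of a DIP."""
    pins_per_row = pin_count // 2
    return (pins_per_row - 1) * pitch_mil


for pins in (28, 64):
    standard = lead_span_mil(pins, 100)
    shrink = lead_span_mil(pins, 70)
    print(f"{pins}-pin: {standard} mil -> {shrink} mil "
          f"({shrink * MIL_TO_MM:.1f} mm)")
```

For the 64-pin package the lead span drops from 3100 mil to 2170 mil, a 30 percent reduction that matches the pitch ratio.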


2.2 Pin Grid Array (PGA) 

PGA is suitable for a high pin count because 
Kovar pins are arranged in a matrix on the 
ceramic substrate (see Fig. 2). A resin seal was 
initially used, but now a seam weld is used to 
reduce costs and improve reliability. A frit seal 
is also sometimes used. When a high-power 
device is mounted, a cavity-down structure 
having an attached radiation fin is used to lower 
the thermal resistance. 

PGA has recently been used as a Surface 
Mount Device (SMD) rather than a Through 
Hole Device (THD) because the pin pitch has 
been reduced due to the higher pin count and 
the pin itself is smaller. PGA is used in micro- 
computers and gate arrays. Recently, it has also 
been used in high-speed Bi-CMOS and ET 
(ECL/TTL) devices. 


2.3 Plastic Pin Grid Array (PPGA) 

The basic structure of the PPGA is the same 
as that of the PGA. A printed wiring board is 
used for the substrate to reduce cost. PPGA has 
good electrical characteristics and a low package 
weight because it uses Cu wiring (35 um thick), 
and has good matching because the substrate 
material is the same as that of the printed wiring 
board. PPGA is used in a broader range of 
applications. However, the limitations of the 
wiring pattern and outer lead limit the 
maximum number of pins in a PPGA to about 200. 
Various structures are being investigated. 
Figure 3 shows the cross sections of possible 
PPGA structures. Figures 3a) and b) are similar 
to the PGA structure and have a resin seal formed 
by the potting method, while c) uses the transfer 
mold method for sealing. In the future, PPGA will 
mainly be used for general-purpose gate arrays. 
Figure 4 shows a PPGA. 

Fig. 1. Current package line-up. 
THD: Through hole device, SMD: Surface mount device, 
LCC: Leadless chip carrier, LDCC: Leaded chip carrier, 
SH-DIP: Shrink DIP, SK-DIP: Skinny DIP, 
SL-DIP: Slim DIP, SIP: Single in-line package, 
ZIP: Zig-zag in-line package, 
ISO: International Organization for Standardization 

Fig. 2. Pin grid array (PGA). 

Fig. 3. Plastic pin grid array (PPGA) structures: a) PWB substrate type, b) ceramic substrate type, c) mold type. 


2.4 Small Outline Package (SOP) 
The external dimensions of this package 
conform to both EIAJ and JEDEC standards. 
Recently, thin Shrink SOP (SSOP), including 
Very Small SOP (VSOP) and Thin SOP (TSOP), has 
been developed for card mounting. SOP is used 
in microcomputers (see Fig. 5). 

Fig. 4. PPGA. 

Fig. 5. Small outline package (SOP). 


2.5 Quad Flat Package (QFP) 

The external dimensions of this package also 
conform to the JEDEC standard. Large quan- 
tities of QFPs of many types are used in gate 
arrays, custom devices, and microcomputers. 
Thin Shrink QFP (SQFP), including Very Small 
QFP (VQFP) and Thin QFP (TQFP) has been 
developed in the same way as SOP (see Fig. 6). 


2.6 Leadless Chip Carrier (LCC) 

The external dimensions conform to the 
JEDEC standard. LCC is used to obtain a highly 
reliable device. As more pins are required, the 
solder connection becomes unstable due to 
dispersion in the solder pads when mounting an 
LCC on the printed wiring board. Therefore, a 
leaded chip carrier is being developed. 

Fig. 6. Very small quad flat package (SQFP). 

Fig. 7. Plastic leaded chip carrier (PLCC). 


2.7 Plastic Leaded Chip Carrier (PLCC) 

The external dimensions conform to the 
JEDEC standard. PLCC is used in 
microcomputers, gate arrays, and custom devices. 
Although the mounting area is small, the J-type 
outer lead structure limits the package to 
about 100 pins. Figure 7 shows a PLCC. 


3. Technology and trends in ASIC packaging 
3.1 High-pin-count Technology 

Table 1 lists the current high-pin-count 
packages. The packages have been developed to 
improve IC/LSI integration and accommodate a 
higher pin count. The pitch has been reduced 
because the body size was reduced while the 
number of pins has increased. The ceramic 
package is fundamentally different from the 
plastic package because its manufacturing 
method is different. Ceramic QFP and PGA can 
achieve a higher pin count than the plastic 
package. Figure 8 shows the relationship 
between the number of pins and body size, 
and shows projections for the future. The 
ceramic package has technological advantages 
which enable it to achieve a higher pin count. 

Table 1. Current high pin count packages 

Package        Pin count   Pin pitch (mm)    Body size (mm) 
QFP (plastic)  80          0.80              14 x 20 
               100         (illegible)       14 x 20 
               120         0.80              28 x 28 
               160         0.65              28 x 28 
PLCC           44          1.27              16.6 x 16.6 
               68          1.27              (illegible) 
               84          1.27              29.3 x 29.3 
PPGA           135         2.54              31.8 x 31.8 
               144         2.54              36.8 x 36.8 
               179         2.54              39.4 x 39.4 
Ceramic QFP    60          0.5               18.3 x 18.3 
               84          0.5               12.7 x 12.7 
               164         0.5               27.9 x 27.9 
               180         0.4               22.5 x 22.5 
               260         0.5               41.1 x 41.1 
PGA            179         2.54              38.1 x 38.1 
               208         2.54              43.0 x 43.0 
               256         2.54              50 x 50 
               256         1.27              25.1 x 25.1 
(illegible)    84          1.0               23.4 x 23.4 
               200         0.5 (staggered)   15 x 15 
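
The way pin count scales with body size differs between peripheral-lead and area-array packages, as the following rough sketch shows. The two formulas are simple geometric upper bounds assumed here for illustration: they ignore corner clearances, the chip cavity, and any design rules, none of which are given in the text. The input values are rows of Table 1.

```python
import math


def peripheral_pin_limit(side_mm, pitch_mm):
    """Upper bound for leads along four sides of a square body (QFP-like)."""
    return 4 * math.floor(side_mm / pitch_mm)


def array_pin_limit(side_mm, pitch_mm):
    """Upper bound for pins on a full square grid (PGA-like)."""
    per_row = math.floor(side_mm / pitch_mm)
    return per_row * per_row


# Rows taken from Table 1: (label, pin count, pitch in mm, side in mm).
qfp160 = ("QFP160", 160, 0.65, 28.0)
pga256 = ("PGA256", 256, 2.54, 50.0)

qfp_bound = peripheral_pin_limit(qfp160[3], qfp160[2])
pga_bound = array_pin_limit(pga256[3], pga256[2])

print(f"{qfp160[0]}: listed {qfp160[1]}, geometric bound {qfp_bound}")
print(f"{pga256[0]}: listed {pga256[1]}, geometric bound {pga_bound}")
```

A peripheral package gains pins only in proportion to its side length, whereas an array package gains them in proportion to its area, so the array type can reach a high pin count at a much coarser pitch.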

The technological problems for various 
package types are described below. 


3.1.1 Reduction of inner lead pitch 
For the ceramic package, the inner lead 
pitch has been reduced, as described below. 

A sample of the LCC200 having an inner 
lead pitch of 0.2 mm has been successfully 
manufactured. To achieve this, the W grain size 
and size distribution were adjusted to 
improve the metalized paste composition. 

The lithographic process was improved to 
enhance the precision of the screen mask and 
pattern. The precision of the green sheet was 
improved by increasing the precision of the 
facility making the green sheet and the dryness 
of the green sheet. 

A two-step bonding pad structure was also 
developed. To further reduce the pitch, the 
two-step bonding pad must be changed to a zigzag 
structure and a multi-step bonding pad structure 
must be developed. 

The pattern precision must be enhanced by 
improving the screen print technology. Alignment 
precision of the multilayer wiring must be 
improved by improving the green sheet lamination 
technology. 

Finally, the package distortion must be 
reduced by improving the firing technique. 

Fig. 8. Number of pins vs. body size (mm), for plastic and ceramic packages. TAB: Tape automated bonding. 

Fig. 9. Number of pins vs. chip size (mm). Wire length: 2 mm. 

For the plastic package, the inner lead pitch 
was reduced as described below. 

Figure 9 shows the relationship between the 
number of pins and chip size when the wire 
length and wire angle are taken into 
consideration when designing the pattern. This 
figure indicates that a 320-pin QFP can be 
manufactured and the lead pitch can be reduced to 
0.16 mm when it is properly designed. The 
process limitation for the lead frame when using 
the stamping method has conventionally been 
the lead thickness for both the pattern width 
and gap7). 

Table 2. Lead frame materials 

Material    Composition                   Thermal         Electrical     Hardness    Tensile     Young's     Elongation 
                                          conductivity    conductivity   (kgf/mm2)   strength    modulus     (%) 
                                          (cal/cm s °C)   (%IACS)                    (kgf/mm2)   (kgf/mm2) 
Cu          99.96 Cu                      0.934           100            110         35          -           6 
42 Alloy    Fe-42Ni                       0.036           2.5            200         80          16 800      7 
MF202       Cu-2.0Sn-0.2Ni                0.37            30             190         55          11 500      11 
TAMAC       (illegible)                   0.80            82             135         43          12 500      >2 
SLF-3       Cu-0.8Cr-0.12Sn               0.75            80             170         55          12 000      10 
EFTEC64     Cu-0.3Cr-0.25Sn-0.1Zn         0.72            75             165         55          12 100      >10 
MCL-1       Cu-0.3Cr-0.1Zr                0.72            -              180         60          14 000      >4 
CCZ         Cu-0.55Cr-0.25Zr              0.83            85             150         50          13 900      13 
EFTEC200    Cu-0.25Ti-1.5Ni-2.0Sn-0.5Zn   0.36            35             195         65          12 500      >5 
KLF-125     Cu-1.25Sn-3.2Ni-0.7Si-0.3Zn   0.36            35             >200        >60         12 500      10 
NK164       Cu-1.6Ni-0.4Si-0.4Zn          0.47            50             >170        >55         14 000      >5 

T.E.C.: 42 Alloy = 4.3 x 10^-6/°C, Cu Alloy = about 18 x 10^-6/°C 
Specific gravity: 42 Alloy = 8.2, Cu Alloy = 8.9 

Currently, the pattern width and gap are 
about 70 percent of the lead thickness. That is, 
when 42 alloy having a thickness of 0.15 mm is 
used, a pitch of about 0.2 mm can be processed. 
To further reduce the pitch, the lead thickness 
must be reduced. This will reduce the amount of 
heat conducted through the lead frame and thus 
degrade the heat dissipation characteristics of 
the entire package. 

As the lead strength also becomes 
insufficient, a lead thickness of 0.15 mm and a 
0.2 mm pitch is a practical limit. Since a copper 
lead is soft and the lead strength is low, the lead 
thickness is limited to 0.2 mm for practical 
applications, and the lead pitch is limited to 
about 0.2 mm. The copper material8) has low 
electrical resistance and high heat conductivity, 
thus the lead thickness can be reduced if the 
mechanical strength is improved. The use of the 
new materials listed in Table 2 is being 
investigated. 
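
The stamping limit quoted for 42 alloy can be worked through as below. The sketch assumes that the minimum pitch is simply one pattern width plus one gap, each a fixed fraction of the lead thickness; the function name is illustrative.

```python
def min_stamped_pitch_mm(lead_thickness_mm, width_gap_ratio):
    """Minimum lead pitch when width and gap are a fraction of the thickness."""
    width_mm = width_gap_ratio * lead_thickness_mm
    gap_mm = width_gap_ratio * lead_thickness_mm
    return width_mm + gap_mm


conventional = min_stamped_pitch_mm(0.15, 1.0)  # width and gap equal thickness
current = min_stamped_pitch_mm(0.15, 0.7)       # about 70 percent of thickness

print(f"conventional limit: {conventional:.2f} mm")
print(f"current limit:      {current:.2f} mm")
```

With a 0.15-mm lead, the 70 percent rule gives a pitch of 0.21 mm, which agrees with the "about 0.2 mm" stated in the text.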

3.1.2 Reduction of outer lead pitch 

For the ceramic package, the outer lead 
pitch has been reduced, as described below. 

Currently, a 0.4 mm pitch is available for 
QFP180, a 0.5 mm pitch for QFP260, and a 
1.27 mm pitch for the Pin Grid Array (PGA). In the 
near future, a pitch of 0.25 mm to 0.32 mm for 
QFP400 to QFP600 and a 0.63 mm pitch for 
PGA400 to PGA600 will become available. 

To achieve this, the firing shrinkage must be 
controlled. Although dispersion in the firing 
shrinkage is generally about 1.0 percent, it is 
currently 0.5 percent at Fujitsu. However, this is 
insufficient for developing a high-pin-count 
package, and the dispersion will be improved 
to 0.3 percent in the future. 
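
The reason shrinkage dispersion matters for fine pitches can be seen from a rough estimate. The sketch assumes that the dispersion acts as a fractional dimensional error accumulated over the full width of the lead pattern, and it uses the 41.1-mm body and 0.5-mm pitch of QFP260 from Table 1 as the example. How the dispersion is actually defined and budgeted is not stated in the text.

```python
def cumulative_position_error_mm(span_mm, dispersion_percent):
    """Worst-case pad position error across a span for a given dispersion."""
    return span_mm * dispersion_percent / 100.0


SPAN_MM = 41.1   # QFP260 body size from Table 1
PITCH_MM = 0.5   # QFP260 outer lead pitch

for dispersion in (1.0, 0.5, 0.3):
    error = cumulative_position_error_mm(SPAN_MM, dispersion)
    print(f"{dispersion:.1f} % dispersion: {error:.3f} mm "
          f"({error / PITCH_MM:.0%} of the pitch)")
```

Under these assumptions even 0.5 percent dispersion shifts the outermost pads by a large fraction of a 0.5-mm pitch, which is consistent with the stated need to reach 0.3 percent before finer pitches are attempted.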

Because improvement of only the precision 
of the conventional thick film technique is in- 
sufficient for forming an outer leaded pad, the 
fine pitch pattern is achieved by combining the 
thick film technique and thin film technique. 

As the pitch is reduced, outer leads made with
conventional materials and manufacturing methods may
no longer be usable with the fine pitch pattern. In
this case, Tape Automated Bonding (TAB) lead material
and the lead bonding method will be required.

For the plastic package, the outer lead pitch has
been reduced, as described below.

The conventional 160-pin QFP, having a pitch of
0.65 mm, formerly had the smallest pitch available.
Recently, however, a 208-pin QFP having a pitch of
0.5 mm has been manufactured. To achieve this,
improvements were made mainly in the cut-off metal
mold and bending metal mold structures. The socket
technique for testing was also developed. To develop
a QFP having 300 or more pins, the lead pitch must be
0.4 mm. To achieve this lead pitch, it is necessary
to first improve the lead frame stamping precision,
improve the cut-off metal mold and bending metal mold
structures for the lead frame, and develop the test
socket.

Table 3. Au- and Cu-wire characteristics

a) Au-wire

                     Room temperature              High temperature (250 °C)
Maker  Type       Strength (g)  Elongation (%)  Strength (g)  Elongation (%)
       FA (S)     12.9          8.9             9.4           1.9
       FA         15            4.5             11.3          2.0
A      SR         11.2          4.6             [illegible]   [illegible]
       GC         13.0          [illegible]     [illegible]   [illegible]
       M          9.7           8.9             6.7           2.1
B      TG-F-S     13.7          6.7             9.4           5.2
       SGS-SH     16.3          [illegible]     11            3
C      SGA-1-SH   13.8          5.2             9             3
       SGA-2-SH   15.8          6.9             11            3
D      MGH        15.7          4.3             14.1          3.5
E      MGW-F 1    [illegible]   [illegible]     12.8          [illegible]
       MGW-H 2    16.8          [illegible]     13.5          1.6

b) Cu-wire

Maker  Type   Strength (g)  Elongation (%)
       TC-A   17.5          11.6
       TC-D   18.0          11.8
       26     10.6          7.2
B      28     16.3          15.8
       32     17.6          15.2
C      HO50   17.9          17.4
D      MCR    16.1          18.5
E      EPCU   15.8          16.9

3.1.3 Development of bonding techniques⁹⁾

To achieve a high pin count, various bonding
techniques have been developed as described below.

1) Aluminum wedge bonding

Bonding speed is a major problem for this bonding
technique. The speed was 0.6 s/wire a few years ago
and is currently 0.35 s/wire. Aluminum wedge bonding
is used for up to 256 pins, but its bonding speed is
insufficient for mass production when it is used for
a high pin count. It is necessary to develop a bonder
having a bonding speed of about 0.2 s/wire.

2) Gold nail head bonding

The first problem with this bonding technique is wire
flow. When the materials listed in Table 3 were
investigated, wire for Full Automatic Bonders (FA)
having higher strength than conventional Stress
Relieved (SR) wire was developed to enable 300 pins
to be wired. However, when 300 or more pins are
wired, the space between the wires becomes smaller
and the wires are liable to touch each other due to
wire flow. To solve this problem, a wire having
higher strength is being developed. A high-precision
wire bonder must also be developed to prevent the
initial curling during bonding from causing wires to
touch each other when 300 or more pins are wired.

Fig. 10—Sample after outer lead bonding (OLB).

A second problem is the bonding speed. The 
bonding speed using ultrasonic thermocom- 
pression was 0.35 s/wire a few years ago and is 
currently 0.25 s/wire. Methods to enable a speed 
of 0.20 s/wire are being investigated. 

The third problem is the wire size. Wires of 25 µm,
30 µm, and 38 µm in diameter are currently being
used. The development of finer wires to reduce costs
is being investigated.

3) Copper nail head bonding¹⁰⁾

Copper wire has the following advantages:

i) The growth speed of the Cu-Al intermetallic
compound is slower than that of Au-Al.

ii) Copper wire has higher tensile strength 
and heat conductivity than Au, thus 
it is advantageous for finer wire. 

iii) Copper wire has a higher Young’s 
modulus and shear modulus and lower 
specific gravity than Au, thus loops are 
not liable to dangle immediately after 
bonding and the wire is not liable to 
flow during molding. 

iv) Direct bonding is enabled for copper 
lead. 

v) Copper wire is less expensive than gold.

Copper wire therefore has more advantages than gold
wire for achieving a higher pin count. The Cu-wire
materials listed in Table 3 are being evaluated. One
problem is that copper wire is easily oxidized, so an
inert atmosphere is required during bonding.

Table 4. TAB sample dimensions (Unit: µm)

Application            Plastic QFP        PGA
Pin count              160                440
Bump material          Au                 Au (on chip)
Bump size              75 × 30            60 × 60 × 25
Lead width             50-70              45
Cu thickness           35                 35
Sn plating thickness   0.5                0.5
Polyimide thickness    [illegible], 125   125
Tape configuration     3 layer            3 layer
Inner lead pitch       100                100
Outer lead pitch       300                100

4) Tape Automated Bonding (TAB) 

The TAB technique has the following advantages over
the wire technique¹¹⁾⁻¹³⁾:

i) Both the pattern width and pitch can be more
easily reduced because the photoetching method is
used in manufacturing the lead.

ii) The bonding time is shorter because the gang
bonding method is used.

iii) The bonding strength is higher.

iv) The electrical characteristics are better.

v) Flat bonding is possible.

The use of this technique in various applications is
being investigated. Several examples of use are shown
below.

First, the use of this technique for a plastic
package having 200 or more pins is being examined.
After Inner Lead Bonding (ILB), Outer Lead Bonding
(OLB) is performed on a lead frame without a stage,
then the normal process is performed. Figure 10 shows
a sample after OLB. Table 4 lists the specifications
of the QFP sample. The major problems of this
application are lead stability during molding,
thermal stress performance, and moisture performance.

Second, the application of this technique to Plastic
PGA (PPGA) is being examined. This technique has the
following advantages:

i) The bonding strength for a Cu lead (50 µm wide ×
35 µm thick) is about 50 g, higher than the strength
for gold wire (FA, 30 µm), which is about 15 g.

ii) Flat bonding is possible. However, scrubbing is
impossible because ILB has already been performed
during die bonding, and tape die material is
required.

Third, as an example of high-density assembly, a
prototype TAB package having a pitch of 4 mil and 484
pins was manufactured. Figure 11 shows the
configuration of an ILB sample. Table 4 lists the
specifications of the PGA sample. Using TAB to
manufacture a package having a pitch of 2 mil to
3 mil is being investigated. Major problems of this
application are the development of fine pattern tape,
the development of a new bonder, and the development
of a thermocompression jig and tool.

Fig. 11—Test die after inner lead bonding (ILB).

Table 5. Thermal expansion coefficient of ceramics

Material        Resistivity (Ω·cm)  Dielectric constant  T.E.C. (10⁻⁶/°C)  λ (cal/cm·s·°C)
Alumina         >10^…               8.5-9.5              6.0-6.5           0.03-0.04
SiC             >4 × 10^…           40                   3.5               0.34
Si3N4           >10^…               7.5                  2.8-3.2           0.03-0.05
AlN             >10^…               8.9                  5.7               0.14
Mullite         >10^…               [illegible]          3.0               0.017
Glass ceramics  >10^…               5.4                  4.8               0.005

Fourth, the technique for resin coating after OLB is
being examined¹⁴⁾. After OLB, the package is secured
on the carrier and is coated in resin by potting for
final shipping. Current problems are:
i) improvement of mechanical strength and stress
performance of bonding,
ii) development of a sealing method, and
iii) development of a new seal material.
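
The per-wire bonding speeds quoted above for aluminum wedge and gold nail head bonding translate directly into the time needed to bond one chip, which is why speed becomes critical at high pin counts. The following sketch simply multiplies the per-wire time by the pin count. It assumes one wire per pin and ignores handling overhead, and the 300-pin case for gold wire is chosen here for illustration.

```python
def bonding_time_s(pin_count: int, seconds_per_wire: float) -> float:
    """Total wire bonding time for one chip, assuming one wire per pin."""
    return pin_count * seconds_per_wire


if __name__ == "__main__":
    # Per-wire speeds (s/wire) are the values quoted in the text.
    cases = [
        ("Al wedge, a few years ago", 256, 0.60),
        ("Al wedge, current", 256, 0.35),
        ("Al wedge, target", 256, 0.20),
        ("Au nail head, current", 300, 0.25),
        ("Au nail head, target", 300, 0.20),
    ]
    for label, pins, speed in cases:
        print(f"{label}: {bonding_time_s(pins, speed):.1f} s per chip")
```

Since gang bonding joins all leads of a chip in one operation, the TAB bonding time does not accumulate wire by wire in this way.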


3.2 Large chip technology 

Figure 9 shows the relation between the pin 
count and average chip size. 
1) Ceramic package 

A Gate Array about 15 mm square is current- 
ly mounted on the PGA400 pin. It is possible 
that a 20 mm square chip can be mounted in the 
near future if the stresses can be reduced. 

First, the stage warp must be reduced to 
reduce the stress. To reduce the stage warp, the 
technique for handling the green sheet must be 
improved to decrease uneven shrinkage during 
firing. 

Second, the die bonding material requires lower
stress. Although Au-Si eutectic and low-melting-point
glass are currently used, Ag glass paste, Ag-PI
paste, and Al-PI paste are being developed.

Third, ceramics having a low coefficient of thermal
expansion must be developed. The ceramics listed in
Table 5 are being examined. AlN is the most promising
ceramic¹⁵⁾. Glass ceramic can be fired at low
temperatures, enabling Au, Ag, Ag-Pd, Ag-Pt, and Cu
to be used as wiring materials. These materials
facilitate finer patterns and higher signal speeds.

2) Plastic package 

A 10 mm square gate array (G/A) is currently mounted
on the 160-pin QFP. It is possible that a 15 mm
square chip can be mounted in the near future. The
plastic package has more serious problems than the
ceramic package because the seal material comes into
direct contact with the chip.

Fig. 12—Package cracking.
[Figure: Factors: 1) humidity from the resin bulk and
the resin/lead-frame boundary; 2) decrease in resin
strength at high temperature (T.E.C. above Tg) and
decrease in adhesive force at high temperature.
Package crack process: humidity absorption (storage
in storehouse), diffusion through the resin bulk,
solder dip (260 °C, 10 s), vaporization of water
(vapor pressure between resin and lead-frame),
delamination, package expansion, package crack.]

The first problem is to improve the thermal 
stress. 

A method for reducing the stresses in the seal resin
is being researched. The Young’s modulus of epoxy
resin was reduced from 1 300 kg/mm² to 1 250 kg/mm²
by modifying the epoxy resin with a soft polymer such
as silicone resin, improving the dispersion of the
modified resin in the epoxy resin, and enhancing the
chemical bonding between the modified resin and the
epoxy resin.

A method for reducing the stress in the Ag paste for
die bonding is also being researched. The Young’s
modulus of Ag paste was reduced from 300 kg/mm² to
50 kg/mm² by improving the molecular structure of the
epoxy resin and modifying the epoxy resin with a soft
polymer.

A means of making the Ag paste into a film 
is being investigated. Ag paste is made into 
a film by changing the epoxy resin of Ag paste 
into the B-stage. A film has a constant die bond- 
ing layer thickness enabling the die to be bonded 
uniformly and eliminating localized stresses. 

The second problem is to improve the solder dip
resistance¹⁶⁾,¹⁷⁾. Figure 12 shows QFP cracking due
to solder dipping. To prevent the package from
cracking, a resin having a lower humidity absorption
rate is used. Also, the strength of the seal resin,
its Tg, and its adhesion to the lead frame are being
improved. The lead frame pattern and structure are
also being improved to increase their adhesion to the
resin. Figure 13 shows the sample models used in the
experiment. Dimples 200 µm square × 150 µm deep are
made by stamping in a regular matrix on the back of
the die stage. This increases the adhesion of the
resin to the back of the die stage and reduces
package cracking.

Fig. 13—Sample model patterns.
[Figure: panels labeled Solder-plate, Ag-plate,
Ni-plate, Dimple, and Groove.]

The third problem is to improve the moisture
resistance. Improvement of the lead frame pattern and
structure, as shown in Fig. 13, and higher purity of
the seal resin are being examined. GROOVE indicates
that several grooves (e.g. 100 µm wide × 50 µm deep)
are set on both sides of each lead. This improves the
adhesion of the resin to the lead frame and lengthens
the interface, which reduces the humidity absorption
rate.

The fourth problem is moldability. On large chips,
voids and incomplete filling tend to occur, the wire
tends to flow, and the stage tends to shift, so the
incidence of molding failures increases. To prevent
molding failures, means to improve the moldability of
the resin and the metal mold structure, and a pattern
that increases the distance between leads, are being
examined.

Table 6. Resin development

Item                  Factor                              Measure
Humidity resistance   Humidity absorption                 Resin composition improvement
                      Adherence with lead frame           [illegible]
                      Ionic impurity                      Purification of components
Thermal stress        Elastic coefficient                 [illegible], plasticizer
                      Thermal expansion coefficient       Filler content
                      Filler size                         Reject large-size filler
Solderability         Resin strength at high temperature  Resin composition improvement
                      Thermal expansion coefficient       Filler content increase
Thermal conductivity                                      Filler content increase
α-particle            Uranium content                     Filler purification
Fluidity              Viscosity                           Resin composition, filler
Hardness              Gel-time, viscosity                 Resin composition, catalyst
Cure                  Cure time                           Catalyst content
Mold release          Releasability                       Wax content
Mold stain            Mold stain                          Plasticizer, wax
Marking               Package surface stain               Plasticizer, wax
Shelf life            Degradation                         Catalyst

Seal resin is mainly being developed as a
countermeasure for the plastic package¹⁸⁾. Table 6
lists the current state of resin development.

First, it is important to improve moisture
resistance. The humidity absorption rate can be
reduced by improving the epoxy resin composition,
enhancing the adhesion with the lead frame by
improving the coupling agent, and reducing the
impurities by making the epoxy resin, hardener, and
filler components of higher purity.

FUJITSU Sci. Tech. J., 24, 4, (December 1988)

M. Sono: Packaging Technology for ASICs

Table 7. Mounting area of packages

Pin number   Package      Pin pitch (mm)  Mounting area  Relative area (%)
[illegible]  DIP          2.54            187            100
             SOP          [illegible]     98             52
             ZIP          1.27            74             40
             [illegible]  [illegible]     84             45
[illegible]  DIP          2.54            460            100
             SL-DIP       2.54            308            67
             SK-DIP       2.54            226            49
             SOP          1.27            119            26
             LCC          1.27            [illegible]    16
             SOJ          1.27            135            29
[illegible]  SH-DIP       1.778           1104           100
             PGA          2.54            691            63
             QFP          1.00            428            39
             LCC          1.02            337            31
[illegible]  [illegible]  2.54            2581           100
                          1.27            630            24
             [illegible]  0.508           1909           74
                          0.4             1186           46
             TAB          0.15            169
             FC           0.5             169

Second, it is important to reduce the problems caused
by thermal stress. A lower Young’s modulus reduces
the thermal stress. The stress is reduced by adding a
flexibilizing agent to lower the modulus, by
increasing the filler content to lower the thermal
expansion coefficient, and by removing coarse
filler¹⁹⁾.

Third, it is important to improve moldability.
Moldability is improved by reducing the viscosity of
the molten resin, which is done by improving the
resin composition and filler size distribution, and
by improving the hardness by using an improved
catalyst.
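
As a rough illustration of why both a lower Young's modulus and a lower thermal expansion coefficient help, the sketch below uses the common first-order estimate σ ≈ E·Δα·ΔT for a fully constrained layer. This formula, the expansion coefficients, and the temperature swing are assumptions introduced here for illustration. Only the two modulus values come from the text.

```python
def thermal_stress_kg_per_mm2(youngs_modulus: float,
                              cte_mismatch_per_degc: float,
                              delta_t_degc: float) -> float:
    """First-order thermal stress estimate: sigma = E * delta_alpha * delta_T.

    youngs_modulus is in kg/mm^2, so the result is also in kg/mm^2.
    """
    return youngs_modulus * cte_mismatch_per_degc * delta_t_degc


if __name__ == "__main__":
    CTE_RESIN = 20e-6    # hypothetical seal resin expansion coefficient (1/degC)
    CTE_SILICON = 3e-6   # hypothetical chip expansion coefficient (1/degC)
    DELTA_T = 150.0      # hypothetical temperature swing (degC)
    mismatch = CTE_RESIN - CTE_SILICON
    for modulus in (1300.0, 1250.0):  # epoxy resin values quoted in the text
        stress = thermal_stress_kg_per_mm2(modulus, mismatch, DELTA_T)
        print(f"E = {modulus:.0f} kg/mm^2 -> stress ~ {stress:.2f} kg/mm^2")
```

In this simple model the stress scales linearly with the modulus and with the expansion mismatch, so a higher filler content, which lowers the resin's expansion coefficient, acts on the second factor.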


3.3 High-density assembly technology²⁰⁾

Table 7 lists the mounting area of each package. The
LCC, PLCC, and Small Outline J-Lead Package (SOJ) are
advantageous because their mounting area is small and
they have a small number of pins. However, it is
almost impossible to check the state of the
interconnecting region after mounting. Also, if there
are more than 100 pins, the lead pitch cannot be
reduced and the body size becomes too large
(currently, the pitch is 1.27 mm). Therefore, it is
difficult for these packages to become major
commercial products. Consequently, QFP and PGA
Surface Mount Devices (SMD) will mainly be used in
the future. As described above, the body size of
these packages is made as small as possible by
reducing the pin pitch. Small, thin packages such as
SSOP and SQFP have recently been developed for
mounting on cards and modules. Table 8 lists the
package series. To develop these packages, adhesion
between the seal resin and the lead frame and resin
strength were first increased, and the moisture
absorption rate and melt viscosity of the resin were
reduced.

Table 8. Current or planned shrink packages

Name             SSOP, TSOP, SQFP, TQFP, SQFP
Pin number       16, 20, 24, 30, 20, 28, 32, 32, 48, 64, 80, 100, 64, 120, 208, 256, 304
Pin pitch (mm)   0.65, 0.65, 0.55, 0.60, 0.5, 0.5, 0.5
Body size (mm)   4.4 × 5.0, 4.4 × 6.5, 5.6 × 7.8, 5.6 × 9.7, 4.4 × 6.5, 8.0 × 11.8, 10.0 × 13.8, 5 × 5, 7 × 7, 10 × 10, 12 × 12, 14 × 14, 10 × 10, 14 × 20, 28 × 40, 40 × 40
Thickness (mm)   1.2, 1.0, [illegible]
Production       Current, Plan / 4Q '88, Current, Plan, Current, Plan

As in Fig. 13, the adhesion between the lead 
frame and the resin was enhanced and the 
absorption rate was reduced. 

The strength of the Au wire was increased 
and the bonding conditions were improved to 
set the height of the wire loop to 100 um or less 
to accommodate a thinner package. 

The current wire loop technique cannot accommodate
higher pin counts or the diversification of future
thin packages because of limits on the inner lead
pitch and wire height. The TAB technique will be
required, and its application is being investigated.


3.4 High-power device technology 

3.4.1 Ceramic package²¹⁾

Figure 14 shows the relationship between the velocity
of the cooling air and the thermal resistance for
FPT-series packages. As the stage size increases, the
thermal resistance decreases. The cooling fins have a
very significant effect.

Fig. 14—Thermal resistances of several flat packages.

Figure 15 shows the cross sections of typical
packages. To improve heat dissipation, the chip is
directly mounted on the metal stage by die bonding.
Currently, a gate array having a maximum power
consumption of about 10 W can be mounted on a PGA
having the structure shown in Fig. 8a). In the near
future, devices having a maximum power consumption of
15 W to 20 W will be mounted. Consequently, the
design of the cooling fins is being improved.

Fig. 15—Cross section of ceramic packages.
[Figure: a) PGA type and b) FPT type, each with
cooling fins.]

Other cooling methods, such as liquid cooling, are
also being investigated.

3.4.2 Plastic package 

Figure 16 shows the relationship between the velocity
of the cooling air and thermal resistance for each
package.

Fig. 16—Typical thermal resistance of plastic packages.

The figure indicates that as the number of pins and
the stage size increase, thermal resistance
decreases. If the lead frame material is changed from
iron alloy 42ALLOY (heat conductivity =
0.036 cal/cm·s·°C) to copper alloy MF202 (heat
conductivity = 0.37 cal/cm·s·°C), the thermal
resistance decreases by about 50 percent. Currently,
gate arrays having a maximum power consumption of
about 1 W can be mounted on QFP120. In the future,
devices having a maximum power consumption of 2 W to
3 W may be mounted.

Consequently, the heat conductivity of the lead frame
must be further increased. For example, if the lead
frame material is changed from MF202 to SLF-3 (heat
conductivity = 0.75 cal/cm·s·°C), the thermal
resistance will decrease by 20-30 percent.

The heat conductivity of the seal resin can be
increased to three times its current value of
0.002 cal/cm·s·°C by changing the seal resin filler
to a filler having a higher heat conductivity than
the quartz currently being used. This will decrease
the thermal resistance by 20-30 percent.

In case the above improvements prove to be
insufficient, means to improve the package structure,
including mounting of an external cooling fin, are
being investigated.
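
For readers more used to SI units, the heat conductivities quoted above can be converted with 1 cal/cm·s·°C = 418.4 W/m·K. The sketch below does this and also shows how a lower thermal resistance translates into a lower junction temperature through T_j = T_a + P·θ. The baseline thermal resistance of 60 °C/W and the 25 °C ambient are hypothetical values chosen for illustration, not data read from Fig. 16.

```python
CAL_PER_CM_S_DEGC_TO_W_PER_M_K = 418.4  # 1 cal = 4.184 J, 1 cm = 0.01 m


def to_w_per_m_k(conductivity_cal: float) -> float:
    """Convert heat conductivity from cal/(cm*s*degC) to W/(m*K)."""
    return conductivity_cal * CAL_PER_CM_S_DEGC_TO_W_PER_M_K


def junction_temperature(ambient_degc: float, power_w: float,
                         theta_degc_per_w: float) -> float:
    """Junction temperature from T_j = T_a + P * theta."""
    return ambient_degc + power_w * theta_degc_per_w


if __name__ == "__main__":
    # Lead frame conductivities quoted in the text.
    for name, k in (("42ALLOY", 0.036), ("MF202", 0.37), ("SLF-3", 0.75)):
        print(f"{name}: {to_w_per_m_k(k):.1f} W/(m*K)")

    THETA_42ALLOY = 60.0               # hypothetical thermal resistance (degC/W)
    theta_mf202 = THETA_42ALLOY * 0.5  # "about 50 percent" lower, per the text
    for label, theta in (("42ALLOY", THETA_42ALLOY), ("MF202", theta_mf202)):
        tj = junction_temperature(25.0, 1.0, theta)
        print(f"{label}: T_j = {tj:.0f} degC at 1 W")
```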


3.5 Developing wafer package²²⁾

Figure 17 shows a sample of the manufactured package.
Table 9 lists its specifications.

Fig. 17—Sample of wafer package.

First, the problems in future development must be
investigated by considering the relationship between
the package and system in terms of the entire package
structure, including the seal and heat dissipation
components.

Second, new ceramics must be developed for the
substrate material. Ceramics having a high heat
conductivity, such as AlN and SiC, glass ceramics
that can be fired at low temperature, and mullite
ceramics having a lower coefficient of thermal
expansion are being investigated.

Third, the bonding method must be researched.
Although the wire bonding technique can be used for
up to 300 pins, the TAB technique will be used for
300 pins to 600 pins and the flip chip technique will
be used for 600 pins or more.

Fourth, the outer lead structure is being
investigated. The QFP type is suitable for 300 pins
or fewer and the SMD-mounted PGA type is suitable for
300 pins or more.

In addition to the above considerations, methods for
using the printed wiring board as a substrate and
using the silicon substrate as a mother carrier must
also be investigated.


4. Conclusion 

ASIC technology continues to progress and expand
rapidly, and ASICs are playing a major role in
IC/LSI. ASIC packages are being highly integrated,
diversified, and customized. Due to the demand for
high-density assembly of IC/LSI to make electronic
equipment smaller, the package is being changed from
THD using conventional DIP to SMD using FPT. The wide
range of packages offered by Fujitsu meets most user
requirements.

Table 9. Specifications of wafer scale integration
and its substrate

a) Features of WSI

Bump size       10 mil × 10 mil
Bump pitch      20 mil
No. of bumps    725
Wafer size      4 inches in dia.
Bump material   Pb/Sn

b) Features of substrate

Material        SiO2/BiO2 glass
T.C.E.          4.1 ppm
Size            4.65 × 4.65 inches
No. of pads     725
Pad pitch       100 mil
Pad size        40 × 40 mil

However, requests for IC/LSI are increasing, and many
new devices such as
1) highly integrated gate arrays (50 000 or more
gates),
2) DRAMs (16 Mbits to 64 Mbits), and
3) high-function microcomputers (32 bits or more)
will be developed.

Demands for custom packages such as wafer packages,
modules, and cards are also increasing. This trend
will continue with the expansion of ASICs. Therefore,
the requirements for ASIC packages are expected to
become more diversified and more stringent.

In response to these concerns, this paper 
first described the ASIC packages offered by 
Fujitsu. 

The technological aspects and trends for these ASIC
packages were also described as follows:
high-pin-count technology, large-chip technology,
high-density-assembly technology, high-power-device
technology, and special custom package technology.

As future applications expand, the technological
problems they present may increase, and there will be
a quick response to solve these problems.

Reliability on Short-Channel MOSLSIs

This paper describes the reliability problems
associated with the scaling down of MOSLSI. Because
the power supply voltage is not scaled, the internal
electric field increases, causing reliability
problems such as degradation of MOSFETs due to hot
carrier generation and electric breakdown of the gate
oxide. Furthermore, high-speed operation and narrow
metalization width increase the current density,
which reduces the interconnection reliability. These
problems can be overcome by improving the device
structure and fabrication technology.


1. Introduction 

To enable devices to have high performance, the
channel length of the MOSFET and the metalization
width have been reduced to about 1 µm. However, this
causes various problems in reliability, such as hot
carriers¹⁾, time-dependent dielectric breakdown
(TDDB)²⁾, and electromigration³⁾. These phenomena
have various mechanisms, but all result from device
scaling down. MOSLSI devices are reduced according to
the scaling rule. According to this rule, the
dimensions and voltages are reduced in equal
proportion, thus the electric field and current
density remain the same. However, the supply voltage
of scaled devices is not proportionally reduced, but
remains at 5 V. This departure from the scaling rule
makes it difficult to obtain reliability in scaled
devices. This problem has, to a large extent, been
solved by the introduction of new materials and by
improvements in the device structure and fabrication
process.

This paper describes the mechanisms of hot carrier
generation, time-dependent dielectric breakdown, and
electromigration. The paper also presents the
countermeasures for these problems.

2. Hot carrier
2.1 Hot carrier generation

Figure 1 shows the basic structure of the n-channel
MOSFET. The electrons flow in the channel from source
to drain under the drain electric field. This flow is
controlled by the gate. As the channel length is
reduced, the electric field generated by the drain
increases and the electrons in the channel become
highly energized.
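
The effect of shrinking the channel length without lowering the supply voltage can be illustrated with a simple average-field estimate, E ≈ V/L. This is a deliberately crude sketch: the real field peaks near the drain, and the 3 µm starting length and the scale factor of 3 are hypothetical values chosen for illustration.

```python
def average_channel_field_v_per_um(drain_voltage_v: float,
                                   channel_length_um: float) -> float:
    """Average lateral field along the channel, E = V / L (V/um)."""
    return drain_voltage_v / channel_length_um


def scale(length_um: float, voltage_v: float, k: float,
          scale_voltage: bool) -> tuple[float, float]:
    """Shrink the length by k; shrink the voltage too only if requested."""
    new_voltage = voltage_v / k if scale_voltage else voltage_v
    return length_um / k, new_voltage


if __name__ == "__main__":
    L0_UM, V0, K = 3.0, 5.0, 3.0  # hypothetical starting point and scale factor
    print(f"original: {average_channel_field_v_per_um(V0, L0_UM):.2f} V/um")
    for scale_voltage in (True, False):
        length, voltage = scale(L0_UM, V0, K, scale_voltage)
        field = average_channel_field_v_per_um(voltage, length)
        label = "voltage scaled" if scale_voltage else "voltage fixed at 5 V"
        print(f"{label}: {field:.2f} V/um")
```

With the voltage scaled together with the length, the average field is unchanged. With the voltage held at 5 V, it grows by the scale factor.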

The electrons then have sufficient energy to overcome
the potential barrier between the SiO₂ and Si, and
enter the gate SiO₂ in the vicinity of the drain.
These electrons are called channel hot carriers and
form fixed electric charges. They also increase the
interface state density and change characteristics
such as Vth and β. Also, electrons in the channel
collide with Si atoms and generate electron-hole
pairs by impact ionization. If the electric field is
large enough, electron-hole pairs are subsequently
generated by avalanche multiplication. These are
called avalanche hot carriers. Avalanche hot carriers
may obtain sufficient energy to enter the gate SiO₂
and change the characteristics in the same way as
channel hot carriers. Experiments have shown that, of
the two types of hot carriers, the latter cause more
significant degradation. Substrate current during
MOSFET operation is also due to avalanche
multiplication. The generated electrons flow into the
drain and the generated holes mostly form the
substrate current.

Fig. 1—Impact ionization in short-channel MOSFET.

Fig. 2—Schematic of LDD structure.
[Figure: labels Gate, SiO₂, Source, Drain, and
n- diffusion.]


2.2 LDD structure MOSFET 

To suppress the generation of hot carriers, 
the electric field generated by the drain must 
be reduced. Since the supply voltage cannot 
be changed (because of the need to maintain 
device compatibility), the drain voltage cannot 
be reduced. The electric field is therefore 
reduced by modification of the device structure. 

Figure 2 shows the typical structure of the Lightly
Doped Drain (LDD) MOSFET. The LDD structure is
fabricated by the self-alignment technique so that
the offset gate structure, developed originally for
high breakdown voltage MOSFETs, can be applied to the
short-channel MOSFET. In the conventional MOSFET, the
drain and gate are vertically aligned, but in the LDD
structure they are separated by the sidewall SiO₂.
The inner side of the drain under the sidewall SiO₂
is called the offset region. The offset region is
lightly doped so that it has a lower impurity
concentration than the drain. Accordingly, the slope
of the impurity distribution from the drain to the
gate becomes gentler and the field strength
decreases. This considerably reduces the possibility
of impact ionization.

Fig. 3—Isub and Ig dependence on Vg.


2.3 Change of characteristics by hot carrier 

Reliability tests for hot carrier generation were
performed at the gate voltage that maximizes the
substrate current at a high drain voltage. As
described in Sec. 2.1, the substrate current is due
to avalanche multiplication and indicates the degree
of hot carrier generation.

Figure 3 shows the typical relationship between the
gate voltage and substrate current. In general, the
maximum substrate current is obtained at a gate
voltage of about half the drain voltage.

Since the carriers trapped in SiO₂ diffuse less
rapidly at lower temperatures, the reliability is
usually tested at room temperature or lower. Figure 4
shows the change in Vth due to avalanche hot
carriers, obtained with the test described above. The
sample is an LDD structure with a channel length of
1.1 µm and a gate SiO₂ thickness of 18 nm. However,
as the fabrication process was not optimized, the
structure used is a poor example. The Vth changed by
100 mV at 10^[illegible] s. Figure 5 shows the gm
change for the same test and shows that gm was
reduced by ten percent after 10^[illegible] s. These
characteristics change because the carriers injected
into the SiO₂ generate a fixed electric charge, the
interface state density is increased, and the
electrons trapped under the sidewall cause the offset
region to have a high resistance.

Fig. 5—gm degradation due to avalanche hot carrier.

Since the value of the substrate current indicates
the amount of hot carrier generation, there is a
relationship between substrate current and the change
in the characteristics. The substrate current depends
greatly on the channel length and the impurity
concentration of the offset region. For example, when
the life time is defined as the period in which gm is
reduced by ten percent, a linear relationship between
the substrate current and the life time can be seen
when they are plotted on a log/log scale (shown in
Fig. 6). That is,

$$\mathrm{MTF} = A \times (I_{sub})^{-B}, \qquad (1)$$

where MTF is the average life time, Isub is the
substrate current, and A and B are constants.

Fig. 6—Relationship between time-to-failure and Isub.

The constants depend on the MOSFET structure. In this
example, B was 3.3. (There have been many reports in
which B has been between 2 and 4.) The relationship
between the substrate current and drain voltage is
given by the following equation.

$$I_{sub} = C \times \exp(-D/V_D), \qquad (2)$$

where VD is the drain voltage and C and D are
constants.

Substituting Equation (2) into Equation (1) gives the
relationship between the life time and drain voltage.

$$\mathrm{MTF} = E \times \exp(F/V_D), \qquad (3)$$

where E (= A × C^(−B)) and F (= B × D) are constants.

The life time for practical use can be estimated from
the results of an accelerated test using Equations
(1) and (3).

These expressions show the extent to which the
substrate current and supply voltage must be reduced
in order to obtain the required life time. If the
substrate current is reduced, the life time can be
increased. To achieve this, the impurity distribution
of the offset region was adjusted to reduce the
electric field strength. However, if the
concentration is reduced too much, the offset region
is easily affected by fixed charge. Therefore, the
distribution must be optimized to take these two
factors into consideration. Figure 7 compares the
test results of optimized and non-optimized MOSFETs.
The optimized LDD MOSFET has a stable Vth and is
therefore the more reliable of the two.

Fig. 7—Vth change of LDD MOSFETs.
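
As a sketch of how the life time relations above (a power law in substrate current and an exponential in inverse drain voltage) might be used to extrapolate from an accelerated test, the code below scales a measured life time by the substrate current ratio and, separately, fits the two constants of the voltage model from two stress voltages. The exponent B = 3.3 is the value quoted in the text. Every other number (test life times, stress voltages, current ratio) is a hypothetical input for illustration.

```python
import math


def lifetime_from_isub_ratio(mtf_test_s: float, isub_test: float,
                             isub_use: float, b: float = 3.3) -> float:
    """Power law MTF ~ Isub^(-B): MTF_use = MTF_test * (Isub_test / Isub_use)^B."""
    return mtf_test_s * (isub_test / isub_use) ** b


def fit_voltage_model(v1: float, mtf1: float, v2: float, mtf2: float):
    """Fit E and F of MTF = E * exp(F / V_D) from two (voltage, life time) points."""
    f = math.log(mtf1 / mtf2) / (1.0 / v1 - 1.0 / v2)
    e = mtf1 / math.exp(f / v1)
    return e, f


def lifetime_at_voltage(e: float, f: float, vd: float) -> float:
    """Evaluate MTF = E * exp(F / V_D) at drain voltage vd."""
    return e * math.exp(f / vd)


if __name__ == "__main__":
    # Hypothetical: 1e4 s life time measured at 10x the use-condition Isub.
    print(f"{lifetime_from_isub_ratio(1e4, 10.0, 1.0):.3g} s")

    # Hypothetical accelerated tests at 8 V and 7 V, extrapolated to 5 V.
    e, f = fit_voltage_model(8.0, 1e3, 7.0, 1e4)
    print(f"F = {f:.1f} V, life time at 5 V = {lifetime_at_voltage(e, f, 5.0):.3g} s")
```

Because of the exponent B, a tenfold reduction in substrate current lengthens the estimated life time by a factor of 10^3.3 in this model.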


3. TDDB 
3.1 Breakdown of oxide 

A high-quality silicon oxide film is a stable
insulator with a breakdown field strength of
approximately 10 MV/cm. In conventional MOSFETs, the
gate oxide was about 40 nm thick. This gave a
breakdown voltage of about 40 V, which was ample for
the 5-V supply. In MOSFETs with a 1-µm channel
length, the gate oxide thickness is less than 20 nm.
If the gate oxide is 15 nm thick, the breakdown
voltage is 15 V, which is still three times the
supply voltage.
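
The arithmetic behind these figures is simply breakdown voltage = breakdown field × oxide thickness. The short sketch below reproduces it and also prints the field across the oxide at the 5-V supply. The function names are introduced here for illustration.

```python
BREAKDOWN_FIELD_MV_PER_CM = 10.0  # approximate value quoted in the text
SUPPLY_VOLTAGE_V = 5.0


def breakdown_voltage_v(oxide_thickness_nm: float) -> float:
    """Breakdown voltage = breakdown field * thickness.

    1 MV/cm equals 0.1 V/nm, so 10 MV/cm corresponds to 1 V per nm of oxide.
    """
    return BREAKDOWN_FIELD_MV_PER_CM * 0.1 * oxide_thickness_nm


def operating_field_mv_per_cm(oxide_thickness_nm: float) -> float:
    """Field across the gate oxide at the supply voltage, in MV/cm."""
    return SUPPLY_VOLTAGE_V / oxide_thickness_nm * 10.0


if __name__ == "__main__":
    for thickness in (40.0, 15.0):
        vbd = breakdown_voltage_v(thickness)
        margin = vbd / SUPPLY_VOLTAGE_V
        field = operating_field_mv_per_cm(thickness)
        print(f"{thickness:.0f} nm: breakdown {vbd:.0f} V, "
              f"margin x{margin:.0f}, operating field {field:.2f} MV/cm")
```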

However, problems were found to exist. Although it
might be expected that dielectric breakdown would
occur only at the breakdown voltage, it was found
that breakdown occurred when voltages considerably
less than the breakdown voltage were applied over a
long period. This phenomenon is called time-dependent
dielectric breakdown (TDDB). Because devices are
highly scaled down and complicated, there is a
possibility of partial thinning of the oxide. This
partial thinning may be the cause of the lowered
breakdown voltage. The local defects of SiO₂ become
more apparent as the level of integration is
increased and the total area of gate oxide
consequently increases. This is considered to be a
particularly serious problem in DRAM capacitors.


3.2 TDDB mechanism 

At present, the mechanism of TDDB is not 
fully understood. However, it is known that 
there are two modes of TDDB, defective mode 
and intrinsic mode. 

The defective mode follows the accumulated
failure function. It has an almost exponential
distribution and shows a relatively high failure
rate in a short period. It is assumed that the
paths through which the breakdown current
flows and the concentration of the electric
field are caused by defects such as oxide pinholes,
contamination from impurities, and
abnormal device shape. This type of TDDB
is considerably reduced by controlling these
defects.

The intrinsic mode becomes apparent after
the defective mode has been sufficiently reduced.
It follows the wear-out-type accumulated failure
function and shows a relatively long life. Even
when the applied voltage is below the breakdown
voltage, tunnel current and F-N current flow in
the oxide. When these currents flow, some
types of defects are assumed to be generated
in the oxide and field concentration is assumed
to occur in some areas.


3.3 TDDB reliability test 

The TDDB experiment is performed as 
follows. The current or voltage across a MOS 
capacitor is monitored by applying a fixed 
electric field across it. Normally, F-N current 
flows even when the MOS capacitor has not 
broken down. When dielectric breakdown
occurs, the voltage becomes abnormally low or
an exceedingly high current flows.

Breakdown of a MOS capacitor can be detected
by monitoring these changes in voltage or
current. The same procedure is repeated using
many samples.

Fig. 8—Accumulated failure due to TDDB.

Figure 8 shows the relationship between the
accumulated failure rate and the test time for an
experiment conducted at 70 °C using an SiO2
layer of 18 nm. This diagram uses Weibull paper
to plot the failure rate distribution. The wear-out-type
failure distribution can be clearly seen.
The shape parameter m is approximately 1.5.
The life time depends on the electric field
strength. A weak electric field results in a long
life time. Figure 9 shows the relationship between
MTF and electric field strength for the
same experiment made at 70 °C and 125 °C.

The life time decreases exponentially as the 
electric field strength increases. The relation- 
ship is given by the following equation. 


$$ \mathrm{MTF} = A \exp(-B E), \tag{4} $$


where MTF is the average life time, E is the
applied electric field strength, and A and B are
constants.

The value of constant B is calculated to be
1.8 orders of magnitude per MV/cm. As the
electric field strength increases by 1 MV/cm,
the life time decreases to $1/10^{1.8}$ of its value.

As is shown in Fig. 9, the relationships at
70 °C and at 125 °C appear as two parallel
straight lines, and so Equation (4) can be expressed
as follows:

$$ \mathrm{MTF} = C \exp(E_a / kT) \exp(-B E), \tag{5} $$

where $E_a$ is the activation energy, kT is Boltzmann's
constant × temperature, and C is a constant.

Fig. 9—Relationship between MTF and electric field.

The activation energy was found to be 
0.34 eV. 
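
As a rough numerical illustration of the temperature term in Equation (5), the following Python sketch evaluates the ratio of the life times at 70 °C and 125 °C for the reported activation energy at equal field strength. The function name and the rounded value of Boltzmann's constant are assumptions made for this sketch.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant in eV/K (rounded)

def temperature_factor(ea_ev, t_low_c, t_high_c):
    """Ratio MTF(t_low) / MTF(t_high) at equal field, from the exp(Ea/kT) term."""
    t_low = t_low_c + 273.15
    t_high = t_high_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_low - 1.0 / t_high))

ratio = temperature_factor(0.34, 70.0, 125.0)
print(f"MTF(70 C) / MTF(125 C) = {ratio:.1f}")
```

With these inputs the ratio comes out at roughly five, which corresponds to the vertical offset between the two parallel lines of Fig. 9 under the stated assumptions.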


3.4 Estimation of life time 

The life time for practical use is estimated
using Equation (5). For usual ICs, the maximum
operating temperature is 70 °C. Assuming an
applied voltage of 5 V per 10 nm of SiO2, then
E = 5 MV/cm.

From Fig.9, an MTF of 60h at 70°C 
corresponds to an electric field of 8 MV/cm. By 
applying these values to Equation (5), the 
following values are obtained: 

Temperature coefficient: 1
Electric field coefficient:
$$ 10^{1.8 \times (8 - 5)} = 2.5 \times 10^{5}, $$
and,
$$ \mathrm{MTF}(70\,^{\circ}\mathrm{C},\ 5\,\mathrm{MV/cm}) = 2.5 \times 10^{5} \times 60\,\mathrm{h} = 1.5 \times 10^{7}\,\mathrm{h}. $$


This is approximately 1 700 years. In practice,
however, it is not the average life time that
is important but the time when the failure rate
reaches an even lower value. This time, corresponding
to a lower failure rate, can be calculated
based on the fact that the accumulated failure
rate follows the Weibull distribution.
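
The extrapolation from the test field to the operating field can be retraced numerically. The Python sketch below uses only the figures given in the text (60 h at 8 MV/cm and 70 °C, and 1.8 decades per MV/cm). The function names and the 365-day year are assumptions of this sketch.

```python
HOURS_PER_YEAR = 24 * 365  # assumed conversion, ignoring leap years

def field_factor(e_test, e_use, decades_per_mv_cm=1.8):
    """Life-time gain when the field drops from e_test to e_use (both in MV/cm)."""
    return 10.0 ** (decades_per_mv_cm * (e_test - e_use))

def mtf_at_use(mtf_test_h, e_test, e_use, temperature_coefficient=1.0):
    """Scale a measured MTF (hours) to the operating field at the same temperature."""
    return mtf_test_h * temperature_coefficient * field_factor(e_test, e_use)

mtf_h = mtf_at_use(60.0, e_test=8.0, e_use=5.0)
print(f"field coefficient = {field_factor(8.0, 5.0):.2e}")
print(f"MTF at 5 MV/cm = {mtf_h:.2e} h = {mtf_h / HOURS_PER_YEAR:.0f} years")
```

The temperature coefficient is 1 here because the test and the operating condition are both taken at 70 °C.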

The Weibull distribution is given by the 
following equation. 

$$ F(t) = 1 - \exp\{-(t / \mathrm{MTF})^{m}\}, \tag{6} $$

where F(t) is the accumulated failure rate, t is
the time, and m is the shape parameter. MTF
here is defined as the characteristic life time
during which the accumulated failure rate
reaches $1 - e^{-1}$. This definition differs from the
original one.

Consequently, the time when the accumu- 
lated failure rate reaches F(t) is given by the 
following equation. 


$$ t = \mathrm{MTF} \times [-\log_e\{1 - F(t)\}]^{1/m}. \tag{7} $$


For example, for m = 1.5, the time when the
accumulated failure rate reaches 0.01 percent is
1/465 of the MTF. Since MTF = 1 700 years for
the sample shown in Fig. 9, a failure rate of
0.01 percent corresponds to 3.7 years. For E =
4 MV/cm, the same failure rate corresponds to
230 years. For high reliability, the oxide must
be used at a voltage below 1/3 of the breakdown
voltage.
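
The figures in this example can be checked by evaluating Equation (7) directly. The following Python sketch does so for m = 1.5 and a cumulative failure of 0.01 percent, and then applies the 1.8 decades per MV/cm field dependence for the 4 MV/cm case. The function name is an illustrative assumption.

```python
import math

def time_fraction(failure_fraction, shape_m):
    """Return t / MTF at which the accumulated failure reaches failure_fraction."""
    return (-math.log1p(-failure_fraction)) ** (1.0 / shape_m)

frac = time_fraction(1e-4, 1.5)        # 0.01 percent, m = 1.5
years_at_5 = 1700.0 * frac             # MTF = 1 700 years at 5 MV/cm
years_at_4 = years_at_5 * 10.0 ** 1.8  # one MV/cm lower field

print(f"t / MTF = 1/{1.0 / frac:.0f}")
print(f"5 MV/cm: {years_at_5:.1f} years, 4 MV/cm: {years_at_4:.0f} years")
```

The strong sensitivity to the field is what motivates the guideline of operating below one third of the breakdown voltage.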


4. Electromigration 
4.1 Mechanism of electromigration 

Metalization in ICs is generally a polycrystalline
thin Al film grown by magnetron sputtering.
When a voltage is applied, electrons flow in
the Al layer from the cathode to the anode. The
metal atoms receive kinetic energy by colliding
with the electrons. Although the atoms are
normally constrained in the lattice, there is a
probability that atoms may obtain a thermal
energy greater than the bonding energy (as
determined by the Boltzmann factor). If there
is no current, these atoms only exchange lattice
positions and do not produce a net flux. However,
when the electrons contribute their kinetic
energy, a flow of atoms is generated in the
direction of the electron flow. This is called
electromigration. The
flux of the atoms can be expressed by the
following equation, using the current density
and the Boltzmann factor.


$$ J_{atm} = A J^{2} \exp(-E_a / kT), \tag{8} $$


where $J_{atm}$ is the flux of metallic atoms, J is
the flux of electrons (the current density), $E_a$ is
the activation energy, kT is Boltzmann's constant ×
temperature, and A is a constant.

Since metalization has non-uniform dimensions,
current density, and temperature distribution,
a discontinuity occurs in the flux of atoms.
At these discontinuous points, various failure
modes occur such as voids, open circuits, and
hillocks. Each of these causes the IC to fail.
Of these problems, the open circuit has been
of particular importance. Recently, short
circuits between adjacent metalization lines caused
by hillocks have also been reported.


4.2 Electromigration reliability test 

4.2.1 Open circuit failure test 

Open circuit failure of metalization is 
described first. Since the flux of atoms increases 
with current density and temperature according 
to Equation (8), the test is done at a high 
current and temperature. 

In general, the current density of the metalization
is about $10^{5}$ A/cm². Since a blowout
occurs due to self-heat generation when the
current density of $10^{7}$ A/cm² is exceeded, a
density of $10^{6}$ A/cm² is used in the test. The
temperature is usually chosen to be 150-300 °C.
Figure 10 shows the relationship between
temperature and the open circuit life time
with a constant current density. The Al sample
was doped with Si and was 2 µm wide, 1 µm
thick, and 800 µm long. There is a linear relationship
between the log of the life time and the
inverse of temperature.

This is given by the following equation. 


$$ \mathrm{MTF} = A \exp(E_a / kT), \tag{9} $$


where $E_a$ is the activation energy, kT is Boltzmann's
constant × temperature, and A is a
constant. Figure 10 shows $E_a$ = 0.5 eV. Figure 11
shows a linear relationship in a log-log plot between
the current density and the open circuit
life time when the temperature is constant. (The
slope is almost 2.)

Fig. 10—Arrhenius plot of electromigration failure.
Fig. 11—Relationship between MTF and current density.

This is given by the following equation. 


$$ \mathrm{MTF} = B J^{-2}, \tag{10} $$


where B is a constant. 
When Equations (9) and (10) are combined,
the following equation is obtained.

$$ \mathrm{MTF} = C J^{-2} \exp(E_a / kT), \tag{11} $$


where C is a constant.

Fig. 12—Relationship between $J^2$ MTF and current
density.

When Equations (8) and (11) are compared,
it can be seen that the open circuit life time is 
inversely proportional to the flux of the atoms. 
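
Equation (11) is what allows an accelerated test to be scaled to operating conditions. The Python sketch below computes the ratio of the life time in use to the life time under test. The specific test and use conditions (200 °C and 70 °C, a tenfold difference in current density) are hypothetical values chosen for illustration, not measurements from this work.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant in eV/K (rounded)

def acceleration_factor(j_test, j_use, t_test_c, t_use_c, ea_ev=0.5, n=2.0):
    """MTF(use) / MTF(test) from MTF = C * J**-n * exp(Ea / kT)."""
    t_test = t_test_c + 273.15
    t_use = t_use_c + 273.15
    current_term = (j_test / j_use) ** n
    thermal_term = math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use - 1.0 / t_test))
    return current_term * thermal_term

# Hypothetical conditions: test at 1e6 A/cm2 and 200 C, use at 1e5 A/cm2 and 70 C.
af = acceleration_factor(j_test=1e6, j_use=1e5, t_test_c=200.0, t_use_c=70.0)
print(f"acceleration factor = {af:.2e}")
```

The current density term and the temperature term multiply, so each can be checked separately by holding the other condition fixed.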

$E_a$ = 0.5 eV is a rather small value for the
self-diffusion of Al atoms. This is because the
sample was a polycrystalline thin film and thus
contains large numbers of grain boundaries and
crystal defects. Since the bonding energy of
atoms is low at grain boundaries or defects,
a low value of activation energy is assumed.
The open circuit life time can be prolonged by
reducing the effect of Al grain boundary diffusion.
Figure 12 compares the open circuit life
times of two samples with different grain sizes.
The samples used Al-Si and were 0.6 µm to
2 µm wide, 0.5 µm thick, and 2 mm long.

The sample with the larger grain size has the
longer life time. This is because the number of
grain boundaries in the large grain sample is smaller
than in the small grain one. For a large grain, the life
time increases as the metalization width decreases.
When the metalization width is reduced
below the grain size, the probability of grain
boundaries is reduced. This phenomenon is
called the bamboo effect. For small grains, the
life time decreases as the metalization width decreases.
If the metalization width is decreased,
an open circuit will result from even a small Al void
and the bamboo effect does not occur. Therefore,
reliable metalization can be obtained
by controlling fabrication to grow Al with a
large grain size.

4.2.2 Short mode failure test 

Hillocks occur at accumulations of the flux
of Al atoms.

This occurs where metalization is connected
to the diffusion layer and current flows from the
diffusion layer. The experiment was done with
the metalization connected to the emitter of a
bipolar transistor. The bipolar transistor is a
convenient device for this experiment because
of its high current drive capacity. A second
metalization layer is formed on the first layer
with an insulation layer between them. Short circuits
between the two layers were then tested. The
sample used Al-Si and was 4 µm wide, 0.8 µm thick,
and 20 µm long. The insulation layer is PSG
1 µm thick. Figure 13 shows the relationship
between time and temperature for the short
circuit failures. The y axis is $J^2$ MTF as the
relationship given by Equation (11) is assumed.
The linearity is worse than for the open-circuit
failure and $E_a$ = 1.3 eV. The reason why the
activation energy differs from that of the
open circuit failure is not known. Since the activation
energy is high, these short circuit failures will
be exceedingly rare at practical temperatures.

4.2.3 Open-circuit failure test with pulse current

In metalization, a current greater than the
maximum DC current capacity can be passed in
a pulse. Metalization passing a pulse current will
have a greater life time than metalization passing
a DC current equal in magnitude to the peak pulse
current. Figure 14 shows the relationship between
the duty cycle and the mean time until open-circuit
failure using a 100-kHz square wave.
The sample used Al-Si and was 2 µm wide,
0.5 µm thick, and 800 µm long. Since it was
expected that the life time would be very long
if the duty cycle was reduced, the experiment
was done with a small grain sample with reduced
life time. The life time is inversely proportional
to the square of the duty cycle.

Fig. 13—Arrhenius plot of electromigration short circuit
failure.
Fig. 14—MTF dependence on duty factor.
Fig. 15—MTF dependence on frequency.

That is,


$$ \mathrm{MTF} = A D^{-2}, \tag{12} $$


where D is the duty cycle, and A is a constant. 
By combining Equations (11) and (12), MTF
can be expressed as follows.

$$ \mathrm{MTF} = B (J D)^{-2} \exp(E_a / kT) = B J_{ave}^{-2} \exp(E_a / kT), \tag{13} $$


where $J_{ave}$ denotes the average current.

Equation (13) indicates that, for pulse 
currents, the life time is determined by the 
average current with modification and can be 
treated in the same way as DC currents. Figure 15 
shows the relationship between the frequency 
and the life time when the duty cycle is constant. 
It is clear that the life time does not depend on 
the frequency and that Equation (13) is valid 
within the range of 10 kHz to 300 kHz. For
pulse currents, the peak current can therefore
exceed the DC current by a factor determined by
the duty factor and still give the same life time as DC.
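
The consequence of Equation (13) for design can be sketched in a few lines of Python. The two helper functions below are illustrative assumptions: the first gives the life-time gain of a pulse over a DC current of the same peak value, and the second gives the peak current density whose average equals a given DC density.

```python
def pulse_lifetime_gain(duty_cycle):
    """MTF(pulse) / MTF(DC at the same peak current density), equal to D**-2."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty cycle must be in (0, 1]")
    return duty_cycle ** -2

def allowed_peak_density(j_dc, duty_cycle):
    """Peak density whose average J * D equals the DC density j_dc."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty cycle must be in (0, 1]")
    return j_dc / duty_cycle

print(pulse_lifetime_gain(0.5))           # life-time gain at 50 percent duty
print(allowed_peak_density(1.0e5, 0.25))  # A/cm2, hypothetical DC limit of 1e5
```

The sketch assumes that the pulse frequency lies within the range in which Equation (13) was verified.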


5. Conclusion 

The authors have made a series of experiments
on the reliability problems associated
with the scaling down of MOS ICs. These experiments
included studies on the change of MOSFET
characteristics caused by hot carriers, SiO2
breakdown by TDDB, and open- and short-circuit
failures of the metalization by electromigration.

By quantitative analysis of the relationship 
of the supply voltage and substrate current to 
the characteristic change, it is found that the 
LDD structure has full reliability against hot 
carrier effects. For the TDDB, guidelines for 
using thin SiO2 with high reliability were
obtained from the relationship between the life 
time of breakdown, temperature, and electric 
field. 

For electromigration, the authors investigated
the activation energy and current density dependency,
as well as the relationship between the
grain size, width, and open circuit life time. The
authors investigated the open circuit life time of 
pulse operation and determined that the life 
time could be calculated by regarding the pulse 
as a DC current equal to the average current. 
In addition to open circuit failures, short circuit 
failures between the first and second metaliza- 
tion were tested and the activation energy was 
obtained. 

It is concluded from these results that these 
problems can be overcome by improving the 
device structure and fabrication processes.

To accurately estimate bipolar circuit performance, two circuit simulation systems using
a two-dimensional device simulator have been developed and compared. One uses the table
method and the other uses the direct method.


The direct method was found to be superior to the table method. Using the direct method,
the propagation delay time of an ECL gate has been estimated and the influence of extrinsic
elements and collector impurity concentration on delay time was investigated.


1. Introduction 

Device simulators are widely used to analyze 
device behavior because a device simulator can 
reflect the device structure exactly and can also 
simulate physical phenomena accurately. Device 
simulators usually examine only the device 
characteristics. The circuit performance is then 
estimated from these characteristics. However, 
when a device is used as a circuit element, 
its performance must be estimated in the circuit 
environment. 

Therefore, a simulation system which can
estimate circuit performance using a two-dimensional
device simulator is highly desirable.
One example of this type of simulator is
MEDUSA1). However, MEDUSA only has a one-dimensional
device simulator of bipolar devices
installed. No simulation system which uses
a two-dimensional device simulator has been
developed for bipolar circuits.

For these reasons, two types of circuit 
simulators using a two-dimensional device 
simulator were developed. One uses numerical 
tables for transistor equivalent circuit elements. 
These tables are derived from the static solutions 
of the device simulator. The other simulator 
solves circuit equations directly using transient 
solutions of the device simulator. We call the
former the "table method", and the latter the
"direct method".

This paper presents the simulation tech- 
niques of the two methods, a comparison 
between them, and application examples of the 
analysis of propagation delay times of an ECL 
gate. 


2. Simulation method 

Figure 1 shows the simulation flow to obtain
circuit performance. A conventional circuit
simulator such as SPICE2) uses a transistor
model. This transistor model has an equivalent
circuit whose elements are expressed by analytical
models. The analytical model has many
parameters that are extracted from the simulation
results of the device simulator. This is
indicated by arrows (1). However, this parameter
extraction is often a very elaborate task,
and important factors essential for high speed
operation such as high current injection, diffusion
capacitance, and two-dimensional effects
cannot be modeled exactly by the analytical
transistor model.

Fig. 1—Simulation flow to obtain circuit performance.
(1): Conventional method, (2): Table method, (3): Direct method.

The table method uses numerical tables
obtained from the two-dimensional device
simulator without extracting model parameters.
It solves nodal equations. This is indicated by
arrows (2).

The direct method does not use the
equivalent circuit. It solves nodal equations
using transient solutions of the device simulator.
This is indicated by arrows (3).


2.1 Device simulator 
Device simulator FLAPS (Fujitsu Laboratories
Analysis Program of Semiconductor devices)
solves the basic semiconductor equations which
are shown below.

Fig. 2—Simulation flow of table method.


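$$ \mathrm{div}(\varepsilon\, \mathrm{grad}\, \psi) = -q (p - n + N_D - N_A), \tag{1} $$
$$ \partial n / \partial t = (1/q)\, \mathrm{div}\, J_n + G - R, \tag{2} $$
$$ \partial p / \partial t = -(1/q)\, \mathrm{div}\, J_p + G - R, \tag{3} $$
$$ J_n = -q n \mu_n\, \mathrm{grad}\, \phi_n, \tag{4} $$
$$ J_p = -q p \mu_p\, \mathrm{grad}\, \phi_p, \tag{5} $$
$$ n = n_{ie} \exp\{(q/kT)(\psi - \phi_n)\}, \tag{6} $$
$$ p = n_{ie} \exp\{(q/kT)(\phi_p - \psi)\}. \tag{7} $$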

G is the generation rate term, R the recombination
rate term, $\phi_n$ the quasi-Fermi potential
of an electron, $\phi_p$ the quasi-Fermi potential
of a hole, and $n_{ie}$ the intrinsic carrier density.
These equations are made discrete using the
finite difference method and are solved using
the incomplete LU decomposition conjugate
gradient method3). This program is also
vectorized for a vector processor.
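
To illustrate the combination of finite differences and a conjugate gradient solver, the following Python sketch solves a one-dimensional model problem, -u'' = 1 with u = 0 at both ends. This is a deliberately simplified stand-in, not the FLAPS implementation: it is one-dimensional, linear, and omits the incomplete LU preconditioner. All function names are assumptions of this sketch.

```python
def apply_laplacian(u, h):
    """Apply the finite-difference operator (-u[i-1] + 2u[i] - u[i+1]) / h^2."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0       # Dirichlet boundary u = 0
        right = u[i + 1] if i < n - 1 else 0.0  # Dirichlet boundary u = 0
        out.append((2.0 * u[i] - left - right) / (h * h))
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conjugate_gradient(b, h, tol=1e-12, max_iter=1000):
    """Unpreconditioned conjugate gradient for the symmetric positive definite system."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the initial guess x = 0
    p = list(r)
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        ap = apply_laplacian(p, h)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

n = 9                       # interior mesh points
h = 1.0 / (n + 1)           # uniform mesh spacing on (0, 1)
u = conjugate_gradient([1.0] * n, h)
print(f"u(0.5) = {u[n // 2]:.6f}")  # exact solution x(1 - x)/2 gives 0.125
```

In the actual device equations the matrix is far larger and the problem is nonlinear, which is why preconditioning and vectorization matter there.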


2.2 Table method 

Figure 2 shows the simulation flow of the 
table method. Before solving the nodal 
equations, numerical tables of equivalent circuit 
elements are created. In this method, we used the 
equivalent circuit shown in Fig. 3. Capacitances 
and current sources are obtained from the 
solutions of the two-dimensional device simula- 
tor. 

The current tables are obtained from the
terminal currents of the base and collector.
To make capacitance tables, total charge Q in
the device is obtained from the following
relation.

$$ Q = \int \rho \, dS. $$

From the derivative of this total charge with
respect to the terminal voltages, two capacitances
can be obtained.

$$ C_{EB} = \Delta Q / \Delta V_{BE}, $$
$$ C_{CB} = \Delta Q / \Delta V_{CE}. $$

Fig. 3—Equivalent circuit of bipolar transistor used
in the table method.
Figure 4 shows the simulation structure of
a bipolar transistor. Because of its symmetrical
structure, only half of the device is simulated.
This structure has an emitter width of 0.05 µm
and a base width of 0.05 µm.

Fig. 4—Simulation structure of bipolar transistor.

Figure 5 shows two-dimensional numerical
tables of currents and capacitances. The maximum
collector-emitter voltage of 2.5 V was
divided into twelve increments and the maximum
base-emitter voltage of 0.9 V into seventeen
increments to create the two-dimensional
numerical tables of base current $I_B$ and collector
current $I_C$. Capacitance tables having 17 x 7
increments were created in the same way.
Irregularly spaced meshes were used to obtain
high accuracy in the high current region. Values
of points between the increments are obtained
by logarithmic linear interpolations. Using a
vectorized program, it took 33 min of CPU
time on a FACOM VP-100 vector processor to
generate these numerical tables.

Fig. 5—Numerical tables used in the table method.
(Panels include b) Base current and d) Base-collector capacitance.)
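
Logarithmic linear interpolation means interpolating the logarithm of the tabulated quantity linearly between grid points, which suits currents that vary exponentially with voltage. The Python sketch below shows the idea in one dimension on an irregular grid. The table values are hypothetical (an ideal exponential characteristic), and the tables described above are two-dimensional, so this is a simplified illustration only.

```python
import math

def log_linear_interp(v, v_grid, i_grid):
    """Interpolate a positive current table linearly in log(I) versus V."""
    for k in range(len(v_grid) - 1):
        v0, v1 = v_grid[k], v_grid[k + 1]
        if v0 <= v <= v1:
            t = (v - v0) / (v1 - v0)
            log_i = (1.0 - t) * math.log(i_grid[k]) + t * math.log(i_grid[k + 1])
            return math.exp(log_i)
    raise ValueError("voltage is outside the table range")

# Hypothetical table: ideal exponential current sampled on an irregular grid,
# with finer spacing toward the high current region.
I_S, V_T = 1e-16, 0.02585
v_grid = [0.5, 0.7, 0.8, 0.85, 0.9]
i_grid = [I_S * math.exp(v / V_T) for v in v_grid]

v_query = 0.76
approx = log_linear_interp(v_query, v_grid, i_grid)
exact = I_S * math.exp(v_query / V_T)
print(f"interpolated = {approx:.4e} A, exact = {exact:.4e} A")
```

For a purely exponential characteristic the interpolation reproduces the exact value, whereas ordinary linear interpolation of the current itself would overestimate it between grid points.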

Extrinsic components such as the extrinsic
poly Si base resistance and a collector-substrate
capacitance are externally attached to the
equivalent circuit of Fig. 3. Generally, nodal
equations are expressed as follows.

$$ \sum I_i = 0, \quad (i = 1, \ldots, N). $$

N is the total node number and $I_i$ is the
current which flows into the i-th node. These
equations are made discrete with respect to
time. Discrete time steps are between 0.2 ps and
5 ps. At each time step, these equations are
solved using Newton's method.

Newton’s method linearizes the non-linear 
currents of a bipolar transistor as follows. 


$$ I_B = I_{B0} + \frac{\partial I_B}{\partial V_{BE}} \Delta V_{BE} + \frac{\partial I_B}{\partial V_{CE}} \Delta V_{CE}. $$


In these equations, derivatives of the cur- 
rents with respect to voltage are obtained from 
the numerical tables using logarithmic linear 
interpolation. 
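
The role of the linearization in Newton's method can be seen on a single node. The Python sketch below solves one nodal equation in which a resistor feeds an ideal exponential junction. The circuit, its component values, and the analytic junction model are hypothetical stand-ins for the table lookups described above.

```python
import math

def node_current(v, v_src=1.0, r=1000.0, i_s=1e-15, v_t=0.02585):
    """Net current into the node: resistor current minus junction current."""
    return (v_src - v) / r - i_s * (math.exp(v / v_t) - 1.0)

def node_conductance(v, r=1000.0, i_s=1e-15, v_t=0.02585):
    """Derivative of node_current with respect to the node voltage."""
    return -1.0 / r - (i_s / v_t) * math.exp(v / v_t)

def newton_solve(v0=0.7, tol=1e-12, max_iter=50):
    """Repeat the linearized update until the voltage correction is negligible."""
    v = v0
    for _ in range(max_iter):
        dv = -node_current(v) / node_conductance(v)
        v += dv
        if abs(dv) < tol:
            return v
    raise RuntimeError("Newton iteration did not converge")

v_node = newton_solve()
print(f"node voltage = {v_node:.4f} V, residual = {node_current(v_node):.1e} A")
```

In the table method the current and its derivatives would come from interpolated table entries rather than from a closed-form expression, but the update step has the same form.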


2.3 Direct method 

Circuit behavior is essentially a transient
phenomenon. Therefore, using transient solutions
of a device simulator is the best method to
simulate circuit behavior. Figure 6 shows the
simulation flow of the direct method. In this
method, currents and current derivatives are
obtained from the transient solutions of the
two-dimensional device simulator and are used
to solve nodal equations.

Fig. 6—Simulation flow of direct method.

Table 1. Bias conditions for transient simulations

Terminal voltage                            Terminal current
Base                 Collector              Base     Collector
V_BE(t)              V_CE(t)                I_B1     I_C1
V_BE(t) + ΔV_BE      V_CE(t)                I_B2     I_C2
V_BE(t)              V_CE(t) + ΔV_CE        I_B3     I_C3

First, the basic semiconductor Equations (1) to
(7) are solved transiently for the three kinds
of terminal voltage for each transistor by increasing
the time by $\Delta t$ to obtain the base and
collector currents. Bias conditions for transient
analysis are shown in Table 1. Using these
currents, derivatives of the currents with respect
to the terminal voltages are obtained as shown
in the following Equations (14) to (17).

$$ \frac{\partial I_B}{\partial V_{BE}} = \frac{I_{B2} - I_{B1}}{\Delta V_{BE}}, \tag{14} $$
$$ \frac{\partial I_B}{\partial V_{CE}} = \frac{I_{B3} - I_{B1}}{\Delta V_{CE}}, \tag{15} $$
$$ \frac{\partial I_C}{\partial V_{BE}} = \frac{I_{C2} - I_{C1}}{\Delta V_{BE}}, \tag{16} $$
$$ \frac{\partial I_C}{\partial V_{CE}} = \frac{I_{C3} - I_{C1}}{\Delta V_{CE}}. \tag{17} $$


The $\Delta V_{BE}$ used is 0.01 V and $\Delta V_{CE}$ is
0.05 V. These calculations are performed for
each transistor. The nodal equations are solved
using these currents and current derivatives. This
procedure is repeated until the nodal equations
converge. If the nodal equations converge, the time
is increased by another $\Delta t$, and the same calculations
are repeated.
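
The three bias conditions of Table 1 and the difference quotients of Equations (14) to (17) can be sketched as follows in Python. The function device_currents is a hypothetical analytic stand-in for one transient solution of the two-dimensional device simulator. Only the perturbation sizes of 0.01 V and 0.05 V are taken from the text.

```python
import math

def device_currents(v_be, v_ce):
    """Hypothetical stand-in for a device simulation; returns (I_B, I_C) in A."""
    i_c = 1e-16 * math.exp(v_be / 0.02585) * (1.0 + v_ce / 50.0)
    i_b = i_c / 100.0
    return i_b, i_c

def current_derivatives(v_be, v_ce, dv_be=0.01, dv_ce=0.05):
    """Forward-difference derivatives from the three bias conditions of Table 1."""
    i_b1, i_c1 = device_currents(v_be, v_ce)          # unperturbed bias
    i_b2, i_c2 = device_currents(v_be + dv_be, v_ce)  # base voltage perturbed
    i_b3, i_c3 = device_currents(v_be, v_ce + dv_ce)  # collector voltage perturbed
    return {
        "dIB_dVBE": (i_b2 - i_b1) / dv_be,
        "dIB_dVCE": (i_b3 - i_b1) / dv_ce,
        "dIC_dVBE": (i_c2 - i_c1) / dv_be,
        "dIC_dVCE": (i_c3 - i_c1) / dv_ce,
    }

derivs = current_derivatives(0.8, 1.0)
for name, value in derivs.items():
    print(f"{name} = {value:.3e} A/V")
```

Each transistor thus costs three device solutions per Newton iteration, which is consistent with the large CPU time reported for the direct method.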

The table method uses a quasi-static ap- 
proach. This means that nodal equations are 
solved using static solutions of the device simula- 
tor. The direct method uses a non-quasi-static 
approach. This means that nodal equations are 
solved using the transient solutions of the device 
simulator. Therefore, the direct method is
considered to be superior to the table method
because a quasi-static approach is not used, and
purely transient phenomena can be included.


2.4 Determination of circuit elements 

To compare circuit speed, the logic swing of
an ECL gate was kept at a constant value. In these
simulation systems, resistance in an ECL gate for 
a fixed logic swing can be automatically deter- 
mined by DC solutions of the table method 
when the transistor size is given. 


3. Examples of analysis 
3.1 Comparison of two methods 

Assuming that exact solutions are obtained 
by the direct method, the accuracy of the table 
method was examined. For comparison, the 
difference between propagation delay times of an 
ECL gate was investigated. Propagation delay 
times were obtained from the simulation of the 
three-stage inverter chain shown in Fig. 7. The
ECL gate has an emitter follower, the supply
voltage is -3.1 V, the emitter follower supply voltage
is -1.8 V, and the reference voltage is -1.2 V.

Fig. 7—Three-stage ECL gate inverter chain.
Fig. 8—Output waveforms of inverter chain.

Figure 8 shows the simulated output waveforms
of this inverter chain using the direct
method when a step voltage having a fall time of
50 ps was applied to the input terminal. The
propagation delay time per gate was obtained
from the time difference between the output
voltages of the first and third stages. In this case,
the propagation delay time per gate is 57 ps and
power consumption per gate is 3.08 mW. It
took 3.7 h of FACOM VP-100 CPU time to
obtain this result. The dependence of propagation
delay time on power consumption for
the two methods shown in Fig. 9 was obtained
in this way.

The table method results show smaller propagation
delay times than the direct method in
the entire power region. The difference between
the two methods is larger in the high power
region than in the low power region. This is
because the table method cannot estimate the
time delay in the extrinsic base region exactly:
the influence of the extrinsic
base resistance is already included in the current
tables, and the extrinsic base-collector capacitance
is included in capacitance $C_{CB}$.

Fig. 9—Comparison of propagation delay time for
the table method and direct method.

In conclusion, the table method has a large 
discrepancy in the propagation delay time when 
the delay is determined by the time delay of 
extrinsic transistor elements. 


3.2 Influence of extrinsic elements 

The influence of extrinsic elements was 
examined using the direct method. Figure 10 
shows the relation between power and prop- 
agation delay time when the extrinsic poly Si 
base resistance and/or collector-substrate capaci- 
tance are changed. 

In the low power region, the charge and 
discharge time of the collector-substrate capaci- 
tance is the dominant factor in determining the 
propagation delay time. In the high power 
region, the charge and discharge time of the 
input capacitance is about the same. 

By decreasing the extrinsic resistance, the 
delay time in the high power region decreases. 
This is because of the reduction in the time 
delay in the extrinsic base region. Decreasing 
the collector-substrate capacitance decreases the
delay time in the low power region. From
these analyses, it is concluded that decreasing
the collector-substrate capacitance is more
effective than decreasing the extrinsic base
resistance to realize a low power, high speed
ECL gate circuit.

Fig. 10—Dependence of propagation delay time on
extrinsic device


3.3 Collector impurity concentration dependence
Figure 11 shows the relation between
collector current and cut-off frequency. The
cut-off frequency $f_T$ was obtained by the
following relation using the static solution of the
two-dimensional device simulator.

$$ f_T = \frac{1}{2\pi} \frac{\Delta I_C}{\Delta Q}. \tag{18} $$

Q is the total charge in the device. The maximum
cut-off frequency of a device having a
collector impurity concentration of $10^{16}$/cm³ is
15.4 GHz. However, this frequency increases to
28.6 GHz when the collector impurity concentration
is increased to $10^{17}$/cm³. This is
because the base pushout due to high current
injection is suppressed by a low collector resistance.
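
A quasi-static cut-off frequency of this kind is evaluated from two neighboring static solutions. The Python sketch below assumes the relation $f_T = \Delta I_C / (2\pi \Delta Q)$. The increments used are hypothetical numbers chosen only to show the order of magnitude, not values from the simulations reported here.

```python
import math

def cutoff_frequency(delta_ic, delta_q):
    """Quasi-static cut-off frequency in Hz from increments of I_C (A) and Q (C)."""
    return delta_ic / (2.0 * math.pi * delta_q)

# Hypothetical increments between two neighboring static bias points.
f_t = cutoff_frequency(delta_ic=1.0e-3, delta_q=1.0e-14)
print(f"f_T = {f_t / 1e9:.1f} GHz")
```

Because the estimate depends only on static solutions, it cannot capture purely transient effects, which is one reason the delay time of the gate is evaluated separately.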
Figure 12 shows the relation between power
and the propagation delay time of an ECL gate for
various collector impurity concentrations. When
the collector impurity concentration increases,
the delay time increases in the low power
region because of the increase in base-collector
capacitance. In the high power region, the delay
time decreases because of the small diffusion
capacitance due to suppression of the base
pushout.

Thus, increasing the collector impurity concentration
increases the maximum cut-off frequency.
However, propagation delay time cannot
be decreased in the entire power region.
Also note that a high collector impurity concentration
is not desirable for a circuit using low
power ECL gates.

Fig. 11—Dependence of cut-off frequency on collector
current for two collector impurity
concentrations.
Fig. 12—Dependence of propagation delay time on
power for two collector impurity concentrations.


4. Conclusion 

We developed two types of simulation 
systems which can estimate bipolar circuit 
performance using a two-dimensional device 
simulator with the following results. 

The table method using static solutions of a
device simulator can give exact solutions for DC
analysis. However, for transient analysis, it gives
inaccurate results especially in the high power
region because of the incorrect equivalent circuit
and the quasi-static approach. The direct
method is considered to be the most accurate
method to simulate circuit behavior, but it requires
a large amount of CPU time.

Using the direct method, extrinsic element 
dependence of the propagation delay time of an 
ECL gate was investigated. A smaller collector- 
substrate capacitance was found to be an effec- 
tive means of realizing a low power and high 
speed bipolar ECL circuit. Also, the dependence
of cut-off frequency and delay time on collector
impurity concentration was investigated. It
was found that high impurity concentrations are
an effective way to increase the maximum cut-off
frequency, but are not effective for obtaining
high speed in the low power region.

In the short term, improvements in the CPU 
time required for the direct method and the 
equivalent circuit for the table method must be
achieved. Based on these simulation systems, 
limitations of the transistor models of con- 
ventional circuit simulators must be clarified, 
and the device simulator must be improved for 
smaller devices.

This report describes a personal computer based state machine compiler. This system transforms
the finite state machine description, Boolean equations, or truth table into net lists of
the CMOS gate array or standard cell.


A new feature of this system is the capability to tune the circuit speed from the user side by 
selecting the number of logic levels when running the system. 


This report includes bench mark results. 


1. Introduction 

As the number of gates in ASICs exceeds
100K gates, it becomes very important to 
shorten the time required for logic design. 
Several silicon compilers for this purpose are 
described. Logic synthesis is examined and 
the algorithms for logic minimization’? 
multiplication of logic levels*”>) are described. 
The rule based approach” is also described 
as a method to control the circuit speed. 

For logic LSI design, LSI functions can be 
divided into three blocks: a data-path block, 
RAM/ROM block, and control logic block 
(see Fig. 1). The data-path block performs 
ALU functions or data register functions. The 
RAM/ROM block stores required data and/or 
programs. These data-path and RAM/ROM 
blocks normally have a regular pattern structure 
such as the bit-slice oriented approach. 

On the other hand, the control logic normal- 
ly consists of a random pattern structure and 
must be independently designed for each logic 
LSI. This means that the data-path or RAM/ 
ROM blocks can be repeatedly used for many 
logic LSIs, which enables the use of function 
libraries. However, for the control logic, it is 
difficult to reuse previous designs. 

In addition, specifications of the control 
logic can only be determined at the end of the 
logic design phase. Therefore, a logic designer 
has to design this control logic within a very 
limited period of time and without errors. 

[Figure: Data-path block, RAM/ROM block, and Control logic block] 

Fig. 1—Block diagram of LSI. 

Considering all of these factors, we have 
developed the finite state machine compiler 
described in this report. Our approach was to 
develop a finite state machine compiler which 
is capable of controlling the number of logic 
levels by the method of factorization when 
constructing multi-level logic from two-level 
logic representations. Therefore the circuit 
speed/area trade-offs can be controlled using 
this compiler. 


2. Speed tunable CMOS state machine compiler 

A CMOS finite state machine compiler 
called “ZEPHCAD™” has been developed. 
It can be applied to CMOS gate arrays and to 
standard cell ASIC design. 

Input to the system includes finite state 
machine descriptions, Boolean equations, and a 
truth table. 

The system receives these descriptions, 
transforms them into the sum of products 
form (AND-OR two-level logic), minimizes 
the AND-OR plane, assigns a binary code to 
each state of the finite state machine, divides 
the two-level logic into multiple logic levels, 
then generates the net lists of a CMOS gate 
array or standard cell technology (see Fig. 2). 

A new feature very useful for logic designers 
is that this system can control the circuit speed 
by selecting the number of logic levels when 
they are divided as described above. 

When selecting the number of logic levels, 
a user can simultaneously control the total gate 
size of the logic. Therefore, a user can very 
quickly determine the circuit speed/area 
trade-offs, and can select the most suitable speed and 
gate size logic for his LSI design. 

This system can also generate the PLA 
pattern after the minimization phase of the 
system flow. 

Our system described here is very useful 
for control logic automated design. A designer 
describes his control logic specifications using 
a language like C, then the system synthesizes 
the gate level circuit automatically. Figure 2 
shows a comparison between our system and the 
usual manual design method. The first step 
of control logic design is to determine the 
specification of control logic. A designer may 
need to specify a state diagram transition and 
logic flowchart. In the case of manual design, 
the designer then writes a gate level schematic 
sheet from the state diagram by translating the 
logic manually. 

In this system, the designer describes the 
state diagram in the form of a software program. 
The system then automatically transforms the 
logic into Fujitsu’s CMOS standard cell or gate 
array having a controlled propagation delay. 

[Figure labels: Finite state machine, Boolean equations, Truth table; Manual; Logic synthesis; Gate level design] 

Fig. 2—Comparison between usual and ZEPHCAD™ 
approach. 

The system is operated on Fujitsu’s personal 
computer FMR series which uses an 8 MHz 
i80286 MPU. The operating system is MS-DOS™ 
v3.1, the memory size is 5 Mbytes, 
and the protected mode is used. 


3. System flow, program functions, and limitations 

Figure 3 shows the general flow of the system. 
The functions of each program are 
described below: 


3.1 Input language 

The system accepts three forms of input 
description: 

1) Finite state machine descriptions 
2) Boolean equations 
3) Truth table. 

Figure 4 shows example input descriptions. 
The finite state machine description specifies 
sequential logic; the Boolean equation and 
truth table descriptions specify combinational 
logic. 

A special purpose editor was developed for 
editing this language. 


3.2 Translator 

The translator converts the input language 
into an AND/OR two-level logic representation. (Our 
system was originally developed for PLA design, 
but was extended to use random logic design 
tools.) 
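
As a rough illustration of what translation into AND/OR two-level form involves, the following Python sketch parses a truth table description of the kind shown in Fig. 4 and collects, for each output, the input patterns (product terms) that drive it to 1. The function names `parse_truth_table` and `to_sum_of_products`, and the reading of `x` as a don't-care, are assumptions made for this sketch; the paper does not describe the ZEPHCAD translator at this level of detail.

```python
def parse_truth_table(text):
    """Parse rows such as '0000/0001' into (input_pattern, output_pattern) pairs."""
    body = text.split(":", 1)[1]        # drop the leading 'TRUTH' keyword
    body = body.strip().rstrip(";")     # drop the terminating semicolon
    rows = []
    for row in body.split(","):
        row = row.strip()
        if not row:
            continue
        inputs, outputs = row.split("/")
        rows.append((inputs, outputs))
    return rows


def to_sum_of_products(rows):
    """For each output column, list the input patterns whose output bit is '1'.

    Assumption: an 'x' in the output pattern is a don't-care and is ignored.
    """
    n_out = len(rows[0][1])
    sop = {k: [] for k in range(n_out)}
    for inputs, outputs in rows:
        for k, bit in enumerate(outputs):
            if bit == "1":
                sop[k].append(inputs)
    return sop


example = """TRUTH:
0000/0001,
0001/00x0,
x100/0100,
1x00/1110;"""

rows = parse_truth_table(example)
sop = to_sum_of_products(rows)
for k in sorted(sop):
    print("output", k, "= OR of product terms", sop[k])
```

Each list is one row of the OR plane, and each pattern string is one row of the AND plane; the later minimization and factoring steps operate on this kind of representation.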


3.3 State assignment 

The state assignment program is applied 
to the finite state machine description. This 
program assigns a binary code to each state 
of the finite state machine using the simulated 
annealing method. First the program assigns a 
random binary code to each state, then it 
exchanges the assigned binary patterns according 
to the cost function. This cost function steers 
the combinational logic toward the minimum 
number of product terms. This program is 
skipped for the Boolean equations or truth 
table description. 
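
The exchange procedure described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the paper does not give its cost function, so the sketch substitutes a simple proxy (the total Hamming distance between the codes of states joined by a transition), and the cooling schedule, step count, and all function names are hypothetical. The transitions are taken from the finite state machine example of Fig. 4.

```python
import math
import random


def hamming(a, b):
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def cost(codes, transitions):
    """Proxy cost (an assumption): sum of code distances over all transitions."""
    return sum(hamming(codes[s], codes[t]) for s, t in transitions)


def as_map(states, pool):
    """The first len(states) entries of pool are the codes of the states."""
    return {s: pool[i] for i, s in enumerate(states)}


def anneal_state_codes(states, transitions, n_bits, steps=2000, t0=2.0, seed=0):
    if 2 ** n_bits < len(states):
        raise ValueError("not enough code bits for the number of states")
    rng = random.Random(seed)
    pool = list(range(2 ** n_bits))
    rng.shuffle(pool)                       # random initial assignment
    current = cost(as_map(states, pool), transitions)
    best, best_cost = list(pool), current
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9   # linear cooling
        i = rng.randrange(len(states))
        j = rng.randrange(len(pool))
        if i == j:
            continue
        pool[i], pool[j] = pool[j], pool[i]       # exchange two codes
        new = cost(as_map(states, pool), transitions)
        accept = new <= current or rng.random() < math.exp((current - new) / temp)
        if accept:
            current = new
            if new < best_cost:
                best, best_cost = list(pool), new
        else:
            pool[i], pool[j] = pool[j], pool[i]   # undo the exchange
    return as_map(states, best), best_cost


states = ["S0", "S1", "S2", "S3"]
transitions = [("S0", "S2"), ("S0", "S1"), ("S1", "S3"), ("S2", "S3"), ("S3", "S0")]
codes, final_cost = anneal_state_codes(states, transitions, n_bits=2)
print(codes, final_cost)
```

A production cost function would instead estimate the number of product terms after minimization, which is what the paper says its cost function targets.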


3.4 Minimizer 

This program uses an ESPRESSO II-based 
algorithm to minimize the AND/OR plane. 
The ESPRESSO II algorithm was published in a 
textbook in 1984. This algorithm is heuristic 
but is very powerful for finding a well optimized 
solution in a short CPU time. Therefore, we 
selected the ESPRESSO II algorithm for our 
personal computer based CAD system. 

[Figure: flowchart of program stages, with the method used by each stage in parentheses. 
Input language: finite state machine descriptions, Boolean equations, truth table. 
Translator: translate into AND-OR form. 
State assignment: assign a binary code to state registers (simulated annealing method). 
Minimizer: minimize the AND-OR plane (ESPRESSO II based algorithm). 
Factoring: divide into multiple levels of AND-OR plane (branch and bound approach); 
the number of levels to be divided into is specified here. 
Technology mapping: assign CMOS gate array or standard cell (rule based approach). 
Outputs: PLA pattern of standard cell, net lists of CMOS gate array, net lists of standard cell.] 

Fig. 3—System flow. 
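
To give a feel for what minimizing the AND/OR plane means, the Python sketch below repeatedly merges pairs of product terms that differ in exactly one input bit. This is deliberately not ESPRESSO II: it is a much simpler greedy merging step, shown only to illustrate how the number of product terms shrinks. The cube notation (`0`, `1`, `x` for don't-care) follows Fig. 4, and the function names are hypothetical.

```python
def merge_pair(a, b):
    """Merge two cubes that differ in exactly one position holding 0 and 1.

    Returns the merged cube with 'x' in that position, or None if the
    cubes cannot be merged this way.
    """
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "x" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "x" + a[i + 1:]
    return None


def reduce_cover(cubes):
    """Greedily merge cubes until no mergeable pair remains."""
    cubes = set(cubes)
    changed = True
    while changed:
        changed = False
        for a in sorted(cubes):
            for b in sorted(cubes):
                if a < b:
                    merged = merge_pair(a, b)
                    if merged is not None:
                        # a OR b is exactly the merged cube, so the
                        # function is unchanged by this replacement.
                        cubes -= {a, b}
                        cubes.add(merged)
                        changed = True
                        break
            if changed:
                break
    return sorted(cubes)


cover = reduce_cover(["000", "001", "011", "111"])
print(cover)  # ['00x', 'x11']
```

Four product terms reduce to two here. A greedy pass like this can miss better covers, which is one reason a heuristic such as ESPRESSO II, with its expand, reduce, and irredundant-cover steps, is used in practice.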


3.5 Factoring 

After minimization, the AND/OR two-level 
logic is transformed into multi-level logic 
by extracting subexpressions common to several 
expressions. By selecting the number of logic 
levels, the designer can control the propagation 
delay and total gate count. 
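
The idea of extracting a common subexpression can be sketched in Python as follows. The example finds the pair of literals shared by the most product terms and replaces it with a new intermediate signal, which lowers the literal count at the price of one extra logic level. This is a simplified illustration, not the branch and bound procedure named in Fig. 3; the data layout (each product term as a set of literal names) and the function names are assumptions.

```python
from collections import Counter
from itertools import combinations


def literal_count(exprs):
    """Total number of literals over all product terms (a rough area measure)."""
    return sum(len(term) for terms in exprs.values() for term in terms)


def extract_common_pair(exprs, new_name):
    """Factor out the literal pair that occurs in the most product terms.

    exprs maps a signal name to a list of product terms, each a frozenset
    of literal names. Returns (new_exprs, pair), or (exprs, None) when no
    pair is shared by at least two terms.
    """
    counts = Counter()
    for terms in exprs.values():
        for term in terms:
            counts.update(combinations(sorted(term), 2))
    if not counts:
        return exprs, None
    pair, n = counts.most_common(1)[0]
    if n < 2:
        return exprs, None
    pair_set = frozenset(pair)
    factored = {}
    for name, terms in exprs.items():
        factored[name] = [
            (term - pair_set) | {new_name} if pair_set <= term else term
            for term in terms
        ]
    factored[new_name] = [pair_set]      # new intermediate signal: one AND term
    return factored, pair


# f = a&b&c + a&b&d,  g = a&b&e
exprs = {
    "f": [frozenset("abc"), frozenset("abd")],
    "g": [frozenset("abe")],
}
factored, pair = extract_common_pair(exprs, "t1")
print(pair, literal_count(exprs), literal_count(factored))
```

Applying such an extraction repeatedly yields deeper but smaller logic; stopping early keeps the logic shallow and fast. This is the speed/area trade-off that the choice of the number of logic levels controls.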


3.6 Technology mapping 

Finally, the designer selects the technology 
required for his logic device. Today, Fujitsu’s 
CMOS standard cell technology using the 1.2 µm 
rule and gate array technology using the 1.5 µm 
and 1.2 µm rules are available. Only this 
program has a technology dependent feature; 
the programs of 3.1 to 3.5 are technology 
independent. The output of this program is 
the net lists of a cell family of a CMOS gate. 


3.7 System limitations 

Figure 5 lists the limitations of the system 
as the maximum values that can be assigned 
to the parameters of the input language 
description. 


Finite State Machine Description 


ANDIN : x1, x2; 
OROUT : +out0, +out1, +out2, -out3; 
STATES ; 
S0 : IF x1 THEN GOTO S2, 
IF x1&x2 THEN GOTO S1, OUT0; 
S1 : WAIT x1 THEN GOTO S3, 
OUT1; 
S2 : GOTO S3, OUT2; 
S3 : GOTO S0, OUT3; 
ENDSTATES ; 


Boolean Description 


BOOLEAN : 
out1 := a&b + c&d, 
t := a&b&c + d, 
out2 := t + b@c; 


Truth Table Description 


TRUTH: 
0000/0001, 
0001/00x0, 
x100/0100, 
1x00/1110; 


Fig. 4—Input language. 


1) Maximum number of inputs: 64 
2) Maximum number of outputs: 64 
3) Maximum number of product terms: 1 024 
4) Maximum number of states: 256 (8-bit) 

The system can handle an input language 
description having a set of parameters within 
these limitations. A plane size of 64 inputs, 
1 024 product terms, and 64 outputs means 
that the circuit can have 1 024 AND gates with 
64 inputs each and 64 OR gates with 1 024 
inputs each. 

When the gate size is expressed in terms of our 
Fujitsu CMOS gate array, in which the basic gate 
is a 2-input NAND, this corresponds to almost 
100K gates. For the PLA plane, the usage ratio 
of the AND/OR elements is far less than 
100 percent and is considered to be about 
10 percent. Therefore, about 10K-gate equivalent 
logic could be entered in the first language 
description phase, which becomes approximately 
2K-gate equivalent net lists after 
minimization. 

This is considered to be the maximum size 
that can be handled on this system. 
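
The order of magnitude of these figures can be checked with a short calculation. The Python sketch below assumes that an n-input gate is built from n - 1 two-input gates; this counting rule is an assumption made here for illustration, since the paper does not state how its "almost 100K gates" figure was obtained, and the result agrees with it only to within the order of magnitude.

```python
def two_input_equivalents(fan_in):
    """Assumed rule: an n-input gate needs n - 1 two-input gates."""
    return fan_in - 1


n_inputs, n_products, n_outputs = 64, 1024, 64   # limits listed in Fig. 5

and_plane = n_products * two_input_equivalents(n_inputs)    # 1024 * 63
or_plane = n_outputs * two_input_equivalents(n_products)    # 64 * 1023
full_plane = and_plane + or_plane

# The paper estimates that only about 10 percent of the plane is used.
used = full_plane // 10

print(and_plane, or_plane, full_plane, used)
```

Under this rule the fully populated plane comes to roughly 130K two-input gates and the 10 percent usage estimate to roughly 13K, consistent in scale with the 100K and 10K figures quoted above. The further reduction to about 2K gates after minimization depends on the logic being synthesized and cannot be derived from the plane size alone.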


4. Bench mark results and considerations 

Figure 6 shows our bench mark results. 
The applied technology is the Fujitsu CMOS 
1.2 µm standard cell. 

Circuit 1 is a vending machine. It has four 
inputs, three outputs, and eleven states. 

Circuit 2 is a 4-bit multiplier (MLP4) having 
eight inputs and eight outputs. 

When the number of logic levels is decreased, 
the propagation delay also decreases, but the 
gate size increases. Therefore, a user can control 
the speed/area trade-offs by selecting the 
number of logic levels when running the 
factoring (logic level multiplication) program 
on the system. 

Presently, the designer must select the 
number of levels the logic must be divided into 
for each design case. Using the system described 
here, however, the selection can be made with- 
out difficulty because either the speed minimum 
solution or the gate minimum solution is given 
automatically by this system according to the 
design selection. 

As shown in Figs. 2 and 3, this system is 
applied to logic synthesis of control logic. 
Therefore, a gate level simulation system is 
necessary after the logic synthesis. 


Fig. 6—Bench mark results. 

For the personal computer environment, 
we provide the VIEWCAD™ system which can 
simulate a maximum of 20K gates. Thus, the 
ZEPHCAD™ system generates net lists and it is 
linked to the VIEWCAD™ system to create 
a total system for LSI logic design. 

The most powerful feature of the ZEPHCAD™ 
system is its technology independence. The 
designer specifies his logic, then assigns his 
logic to standard cell or gate array technology 
as necessary. When the technology advances, 
the designer can run the final “Technology 
Mapping” program for his new technology 
design. We conclude that to shorten the logic 
design time, language based and technology 
independent tools are becoming increasingly 
important. 


5. Conclusion 

A finite state machine compiler called 
“ZEPHCAD™” applicable to ASICs, including 
CMOS gate arrays and standard cells, has been 
developed. The system is capable of controlling 
the circuit speed by selecting the number 
of logic levels and thus selecting total gate 
size. Input to the system can be a state machine 
description, Boolean description, or truth 
table. The system generates net lists of the 
desired ASIC. 

A set of parameter limitations of the system 
is described and the bench mark test results for 
1.2 µm CMOS standard cell technology are also 
provided. 
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